Introduction to Python for Data Science
Шиловских Петр Александрович
- The main objective of the course is to introduce students to the basic concepts of Python, its syntax, functions, and packages so that they can write scripts for data manipulation and analysis. The course develops skills in writing and running Python code. It covers variable types and their features, basic operators and statements, loops, and the main packages for data science: NumPy, pandas, and Matplotlib. By the end of the course, students should be able to write short scripts to import, prepare, and analyze data.
- Know the basic data types in Python.
- Know the operators and how to clean and merge datasets.
- Know the pandas library and the main DataFrame methods.
- Know how to import data into Python.
- Know how to work in Jupyter Notebook.
- Introduction to Python for Data Science
  1. Introduction to Python for Data Science. An introduction to the basic concepts of Python. Learn how to use Python interactively and through scripts. Create variables and get acquainted with Python's basic data types. Learn to store, access, and manipulate data in lists. Learn how to use functions, methods, and packages. NumPy is a fundamental Python package for practicing data science efficiently. Learn to work with the NumPy array and its tools, and get started with data exploration. https://www.datacamp.com/courses/intro-to-python-for-data-science
- Intermediate Python for Data Science
  2. Data types and operators. Building various types of plots and customizing them to be visually appealing and interpretable. Learn about the dictionary, an alternative to the Python list, and the pandas DataFrame. Creating and manipulating datasets, and accessing the information in these data structures. Boolean logic is the foundation of decision-making in Python programs. Learn about the comparison operators, how to combine them with Boolean operators, and how to use the Boolean outcomes in control structures. Filtering data in pandas DataFrames using Boolean logic. While and for loops. https://www.datacamp.com/courses/intermediate-python-for-data-science
  3. Cleaning and merging data. Diagnosing issues such as outliers, missing values, and duplicate rows. https://www.datacamp.com/courses/cleaning-data-in-python
- Python DataFrames
  4. Analysis, selection, and visualization techniques with pandas DataFrames. https://www.datacamp.com/courses/pandas-foundations
  5. Extracting and transforming DataFrames, advanced indexing, rearranging and reshaping data. https://www.datacamp.com/courses/manipulating-dataframes-with-pandas
  6. Merging DataFrames with pandas. https://www.datacamp.com/courses/merging-dataframes-with-pandas
- Importing Data in Python
  7. Importing Data in Python. Ways to import data into Python: from flat files such as .txt and .csv; from files native to other software, such as Excel spreadsheets and Stata, SAS, and MATLAB files; and from relational databases such as SQLite and PostgreSQL. https://www.datacamp.com/courses/importing-data-in-python-part-1
- Environment for scientific programming in Python
  8. Jupyter Notebook as an environment for scientific programming in Python, its structure and features.
- Self-study work
- Exam
  The final student assessment is a project performed in a team of no more than 2 people. Each team uses a provided dataset or collects its own data, defines a research question, and applies one or a combination of the learned methods of data analysis with Spreadsheets. As a result of the project, each team writes a report and prepares a working file. The exam grade comprises the grade for the report, the grade for the working file, and the grade for answering questions.
- Vanderplas, J. T. (2016). Python Data Science Handbook: Essential Tools for Working with Data (1st ed.). Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=nlebk&AN=1425081
- Thomas, S. (2014). Basic Statistics. [N.p.]: Alpha Science International Limited. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1663598
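
The course description states that students should be able to write short scripts to import, prepare, and analyze data. As an illustration only, here is a minimal sketch of such a script using pandas. The tables, column names (`student_id`, `group`, `score`), and helper functions are hypothetical and are not taken from the course materials. The data is read from in-memory strings so the example is self-contained; a real script would most likely pass file paths to `pd.read_csv`.

```python
import io

import pandas as pd

# Hypothetical data standing in for two flat .csv files.
STUDENTS_CSV = """student_id,name,group
1,Anna,A
2,Boris,B
3,Clara,A
3,Clara,A
4,Dmitry,B
"""

SCORES_CSV = """student_id,score
1,78
2,
3,91
4,64
"""


def load_and_prepare(students_csv: str, scores_csv: str) -> pd.DataFrame:
    """Import two tables, clean them, and merge them on student_id."""
    students = pd.read_csv(io.StringIO(students_csv))
    scores = pd.read_csv(io.StringIO(scores_csv))

    # Cleaning: drop exact duplicate rows and rows with a missing score.
    students = students.drop_duplicates()
    scores = scores.dropna(subset=["score"])

    # Merging: an inner join keeps only students that still have a score.
    return students.merge(scores, on="student_id", how="inner")


def mean_score_by_group(df: pd.DataFrame) -> pd.Series:
    """Analysis: average score per group."""
    return df.groupby("group")["score"].mean()


data = load_and_prepare(STUDENTS_CSV, SCORES_CSV)

# Boolean filtering: keep rows whose score is above the overall mean.
above_mean = data[data["score"] > data["score"].mean()]

group_means = mean_score_by_group(data)
print(data)
print(group_means)
```

The inner join is one possible design choice: it silently drops students without a valid score. Passing `how="left"` to `merge` would keep them with a missing value instead, which may be preferable when the missing rows themselves are of interest.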