Introduction to Python for Data Science

Academic year: 2019/2020
Language of instruction: English
Credits: 4
Status: Elective course
When taught: 4th year, module 1

Course Syllabus

Abstract

The course gives students a broad overview of Python, a general-purpose programming language that is becoming ever more popular for data science. The focus is on applying Python specifically to data science. The course covers ways to import, store, and manipulate data, as well as useful data science tools for conducting data analyses. The course is intended for students with little programming background. Learning is supported by the DataCamp platform.
Learning Objectives

  • The main objective of the course is to introduce students to the basic concepts of Python, its syntax, functions, and packages, so that they can write scripts for data manipulation and analysis. The course develops the skills of writing and running Python code. It covers variable types and their features, basic operators and statements, loops, and the main packages for data science: NumPy, Pandas, and Matplotlib. By the end of the course, students should be able to write short scripts to import, prepare, and analyze data.
Expected Learning Outcomes

  • Know the basic data types in Python.
  • Know the basic operators and how to clean and merge datasets.
  • Know the pandas library and the main DataFrame methods.
  • Know how to import data into Python.
  • Know how to work in Jupyter Notebook.
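
As a rough illustration of the first two outcomes, the sketch below touches basic data types, a list, a dictionary, comparison and Boolean operators, and a for loop. All variable names and values are hypothetical and are not taken from the course materials.

```python
# Hypothetical sample values, chosen only for illustration.
height_m = 1.79      # float
name = "Ann"         # str
age = 21             # int
is_student = True    # bool

# A list stores an ordered collection; a dictionary maps keys to values.
scores = [72, 85, 90, 64]
capitals = {"France": "Paris", "Norway": "Oslo"}

# Comparison operators combined with a Boolean operator form a condition.
passed = [s for s in scores if s >= 70 and s <= 100]

# A for loop over the key-value pairs of the dictionary.
lines = []
for country, capital in capitals.items():
    lines.append(f"The capital of {country} is {capital}")

print(type(height_m).__name__, type(name).__name__,
      type(age).__name__, type(is_student).__name__)
print(passed)
print(lines)
```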
Course Contents

  • Introduction to Python for Data Science
    1. Introduction to Python for Data Science. An introduction to the basic concepts of Python. Learn how to use Python interactively and through scripts. Create variables and become acquainted with Python's basic data types. Learn to store, access, and manipulate data in lists. Learn how to use functions, methods, and packages. NumPy is a fundamental Python package for practicing data science efficiently. Learn to work with the powerful tools of the NumPy array and get started with data exploration. https://www.datacamp.com/courses/intro-to-python-for-data-science
  • Intermediate Python for Data Science
    2. Data types and operators. Building various types of plots and customizing them to be visually appealing and interpretable. Learn about the dictionary, an alternative to the Python list, and the pandas DataFrame. Creating and manipulating datasets and accessing information from these data structures. Boolean logic is the foundation of decision-making in Python programs. Learn about the different comparison operators, how to combine them with Boolean operators, and how to use the Boolean outcomes in control structures. Filtering data in pandas DataFrames using logic. While and for loops. https://www.datacamp.com/courses/intermediate-python-for-data-science
    3. Cleaning and merging data. Diagnosing issues such as outliers, missing values, and duplicate rows. https://www.datacamp.com/courses/cleaning-data-in-python
  • Python DataFrames
    4. Analysis, selection, and visualization techniques with Pandas DataFrames. https://www.datacamp.com/courses/pandas-foundations
    5. Extracting and transforming DataFrames, advanced indexing, rearranging and reshaping data. https://www.datacamp.com/courses/manipulating-dataframes-with-pandas
    6. Merging DataFrames with pandas. https://www.datacamp.com/courses/merging-dataframes-with-pandas
  • Importing Data in Python
    7. Importing Data in Python. The ways to import data into Python: from flat files such as .txt and .csv; from files native to other software such as Excel spreadsheets, Stata, SAS, and MATLAB files; and from relational databases such as SQLite and PostgreSQL. https://www.datacamp.com/courses/importing-data-in-python-part-1
  • Environment for scientific programming in Python
    8. Jupyter Notebook as an environment for scientific programming in Python, its structure and features.
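
The import, cleaning, merging, and filtering topics listed above can be sketched in a few lines of pandas. This is a minimal sketch, not an excerpt from the DataCamp courses: the two in-memory CSV tables, their column names, and their values are all hypothetical and stand in for flat files on disk.

```python
import io

import pandas as pd

# Hypothetical data standing in for two flat .csv files.
orders_csv = io.StringIO(
    "order_id,customer_id,amount\n"
    "1,10,25.0\n"
    "2,11,\n"
    "2,11,\n"
    "3,10,40.0\n"
)
customers_csv = io.StringIO(
    "customer_id,city\n"
    "10,Moscow\n"
    "11,Kazan\n"
)

# Importing: read_csv accepts a file path or a file-like object.
orders = pd.read_csv(orders_csv)
customers = pd.read_csv(customers_csv)

# Cleaning: drop duplicate rows, then rows with a missing amount.
clean = orders.drop_duplicates().dropna(subset=["amount"])

# Merging: attach each customer's city to their orders.
merged = clean.merge(customers, on="customer_id", how="left")

# Filtering with Boolean logic: keep orders above a threshold.
large = merged[merged["amount"] > 30]

print(large)
```

With a real file, the `io.StringIO` objects would be replaced by a path passed to `pd.read_csv`.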
Assessment Elements

  • DataCamp (non-blocking)
  • Exam (non-blocking)
Interim Assessment

  • Interim assessment (module 1)
    0.5 * DataCamp + 0.5 * Exam
Bibliography

Recommended Core Bibliography

  • Vanderplas, J. T. (2016). Python Data Science Handbook: Essential Tools for Working with Data (1st ed.). Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=nlebk&AN=1425081

Recommended Additional Bibliography

  • Seemon Thomas. (2014). Basic Statistics. [N.p.]: Alpha Science International Limited. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1663598