Description
This course introduces you to data analysis and reporting with Python. The first step with any dataset is to read it, clean it, and understand its content. Then, after numerical and graphical analysis, you communicate the results through beautiful presentations and reports. This is exactly what this course is about.
The course covers the basics of Python, the main libraries for data analysis, and how to create reports and visualizations. It is interactive and practical: you will learn by doing, working through real business problems.
Topics include
- Origins and key features of Python
- Editors and platforms: what’s available and how to choose
- Installing and configuring the Python environment with Anaconda
- Visual Studio, Jupyter, JupyterLab
- Virtual environment: what it is, what it’s used for, and how to configure it
- Introduction to Git, GitHub, and GitLab
- Introduction to data analysis
- Introduction to major Python libraries for data analysis: NumPy and Pandas
- What vectorized programming is and how it improves readability and performance
- NumPy arrays: what they are, why they are useful, and key features
- Vector and matrix operations
- Tensor operations
- Practical examples of using NumPy
- Introduction to Pandas
- Pandas: subsetting, selecting, filtering, sorting, unique values, aggregations, basic statistics
- Reading and writing data: text files, Excel, database connections, JSON, XML
- Data preparation and cleaning: handling missing data, replacing values, normalization, deduplication, permutations, random sampling, identifying and removing outliers, computing indicators and dummy variables, discretization
- Working with datasets: reshaping, merging, joining, summary statistics
- Introduction to Plotnine
- Creating basic charts with Plotnine
- Customizing charts with Plotnine
- Data aggregation and group analysis
- Working with dates and timelines
- Introduction to the Plotly library: creating interactive visualizations in the browser
- Exporting Jupyter notebooks as beautiful reports
What you will be able to do
- Use data from several data sources
- Tidy your dataset
- Discover relationships in your data
- Create visualizations
- Fit a model for your data
- Deliver insights and results with a clear report or presentation
- Quickly create visualizations to understand the dataset
- Graphically highlight relationships in data
- Choose the best representation for the data types you have
- Use specific plot types for graphs and other visualizations
- Present results as PDFs, detailed reports, dashboards, or slides
Duration
3 days
Prerequisites
None.
Audience
This course is fundamental for every business area. Example datasets can be chosen to match the industry, for better understanding and faster application of the concepts.