
Question: Python's Pandas and Matplotlib Libraries

20 Sep 2023, 12:06 PM


Your final document should be a Jupyter notebook (i.e., download or save it as an .ipynb file).

In answering each question, please include:

a) the question as a markdown header in your Jupyter notebook,

b) the raw code that you used to generate any results, tables, or figures,

c) the top ten or fewer rows of the dataframe (do not include more than ten rows for any table in your report).

Include any plots or figures generated from your code as well.

The problem set is in HW1.docx.

See the course slides in the other attached file.

Use a Google Colab Jupyter notebook from Google Drive.



HW 1: Getting to know Python's Pandas and Matplotlib libraries

Your final document should be a Jupyter notebook (i.e., download or save it as an .ipynb file). The file should be uploaded to this assignment on the course website.

In answering each of the following questions, please include a) the question as a markdown header in your Jupyter notebook, b) the raw code that you used to generate any results, tables, or figures, and c) the top ten or fewer rows of the dataframe (do not include more than ten rows for any table in your report).

Include any plots or figures generated from your code as well.

Part A:

1. Find the URL for the mtcars dataset on the following website:

Read through the "DOC" file to understand the variables in the dataset, then use the following URL to import the data with the pandas read_csv function.

2. Display the first five rows of the data.

3. Calculate the average of the mpg column for all cars within each category of the cyl column.

4. Create a histogram using the mpg column.

5. Choose two variables in the data and create a scatterplot.
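
The Part A steps can be sketched roughly as follows. This is one possible approach, not a model answer. Because the dataset URL is not reproduced here, the sketch reads a handful of mtcars rows typed inline (an assumption made only so the code runs offline); in the notebook you would pass the dataset URL to read_csv instead. The choice of wt and mpg for the scatterplot, the bin count, and all variable names are illustrative assumptions.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# ASSUMPTION: a few mtcars rows (subset of columns) typed inline as a stand-in.
# In the assignment, load the full file instead: cars = pd.read_csv(url)
SAMPLE_CSV = """model,mpg,cyl,hp,wt
Mazda RX4,21.0,6,110,2.620
Mazda RX4 Wag,21.0,6,110,2.875
Datsun 710,22.8,4,93,2.320
Hornet 4 Drive,21.4,6,110,3.215
Hornet Sportabout,18.7,8,175,3.440
Valiant,18.1,6,105,3.460
"""

# Step 1: import the data with read_csv (here from an in-memory buffer)
cars = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Step 2: head() returns the first five rows by default
first_five = cars.head()
print(first_five)

# Step 3: average mpg within each category of cyl
mpg_by_cyl = cars.groupby("cyl")["mpg"].mean()
print(mpg_by_cyl)

# Step 4: histogram of the mpg column (the bin count is an arbitrary choice)
fig_hist, ax_hist = plt.subplots()
ax_hist.hist(cars["mpg"], bins=5)
ax_hist.set_xlabel("mpg")
ax_hist.set_ylabel("Number of cars")
ax_hist.set_title("Distribution of mpg")

# Step 5: scatterplot of two variables (wt vs. mpg chosen for illustration)
fig_scatter, ax_scatter = plt.subplots()
ax_scatter.scatter(cars["wt"], cars["mpg"])
ax_scatter.set_xlabel("wt")
ax_scatter.set_ylabel("mpg")
ax_scatter.set_title("mpg vs. wt")
```

In a Colab or Jupyter cell the figures render when the cell finishes; in a plain script you would add plt.show() at the end.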

Part B (Repeat some of this basic code using new data!):

1. Find a tabular dataset that interests you and that has "tidy" data. (Tidy data is data that is ready for your data analysis. For our tasks we want data with columns representing X and y data, where columns represent variables and rows represent non-repeating observations.) Give a brief description of the dataset. Provide a citation of the dataset (any format is fine).

2. Display the first five rows of the data.

3. Create a visualization using one (or two) variables from this data.
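
For Part B, the same pattern applies to whatever dataset you pick. The sketch below is only an illustration: the table, its column names (hours_studied, exam_score), and its values are made-up placeholders, not a real dataset, and the duplicate-row and missing-value checks are one simple way to test the "non-repeating observations" idea rather than a complete definition of tidy data.

```python
import matplotlib.pyplot as plt
import pandas as pd

# ASSUMPTION: made-up placeholder values, NOT a real dataset.
# Replace this with pd.read_csv(...) pointing at the dataset you chose.
study = pd.DataFrame(
    {
        "hours_studied": [1.0, 2.0, 3.5, 4.0, 5.5, 6.0],
        "exam_score": [52, 58, 66, 70, 81, 85],
    }
)

# Quick tidy checks: rows should be non-repeating observations,
# and it helps to know up front whether any values are missing.
has_duplicate_rows = bool(study.duplicated().any())
missing_per_column = study.isna().sum()
print("Duplicate rows present:", has_duplicate_rows)
print(missing_per_column)

# Step 2: display the first five rows
preview = study.head()
print(preview)

# Step 3: one visualization using two variables (X on the horizontal axis, y on the vertical)
fig, ax = plt.subplots()
ax.scatter(study["hours_studied"], study["exam_score"])
ax.set_xlabel("hours_studied")
ax.set_ylabel("exam_score")
ax.set_title("exam_score vs. hours_studied")
```

If the duplicate check comes back True for your own data, study.drop_duplicates() is one way to remove the repeated rows before plotting.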
