</div>
Here we will work with public data from Airbnb, released under the CC0: Public Domain license and available on Kaggle.
Since 2008, guests and hosts have used Airbnb to expand their travel possibilities and to offer a more unique, personalized way of experiencing the world. This dataset describes the listing activity and metrics in NYC, NY for 2019.
The data file includes the information needed to learn more about hosts and geographical availability, along with the metrics needed to make predictions and draw conclusions.
Part of this notebook is based on the work of Dgomonov, published on Kaggle under the Apache 2.0 open source license and adapted by Juan Carlos Basto Pineda. I encourage you to check out the original notebook for further details.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
airbnb=pd.read_csv("data/AB_NYC_2019.csv")
# Your Answer Here
#checking type of every column in the dataset
airbnb.dtypes
#dropping columns we are not interested in, or that we remove for ethical reasons
airbnb.drop(['id','host_name','last_review'], axis=1, inplace=True)
#examining the changes
airbnb.head(3)
airbnb.isnull().sum()
airbnb.shape
airbnb.fillna({'reviews_per_month':0}, inplace=True)
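The missing-value step can be checked on a small example. The sketch below uses a made-up toy table (not the real data) to show that passing a dict to `fillna` only touches the named column, so missing values in other columns remain and need their own treatment.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the listings table (values are made up for illustration).
toy = pd.DataFrame({
    "name": ["Cozy loft", None, "Sunny room"],
    "number_of_reviews": [12, 0, 3],
    "reviews_per_month": [0.8, np.nan, 0.2],
})

# Fill only the column named in the dict; other columns keep their missing values.
filled = toy.fillna({"reviews_per_month": 0})

# Count the missing values that remain in each column.
remaining = filled.isnull().sum()
print(remaining)
```

Running `airbnb.isnull().sum()` again after the fill is a quick way to confirm which columns still contain missing entries.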
#let's proceed with examining some interesting categorical unique values
airbnb.neighbourhood_group.unique()
airbnb.neighbourhood_group.value_counts()
airbnb.room_type.unique()
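When exploring a categorical column, it can help to look at the number of distinct values and their relative shares as well as the raw counts. A minimal sketch on a made-up toy column (the values below are illustrative, not taken from the dataset):

```python
import pandas as pd

# Toy stand-in for a categorical column of the listings table.
toy = pd.DataFrame({
    "neighbourhood_group": ["Manhattan", "Brooklyn", "Manhattan", "Queens"],
})

# Number of distinct categories.
n_groups = toy["neighbourhood_group"].nunique()

# Share of rows in each category (fractions that sum to 1).
shares = toy["neighbourhood_group"].value_counts(normalize=True)

print(n_groups)
print(shares)
```

The same two calls work unchanged on `airbnb.neighbourhood_group` or `airbnb.room_type`.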
neighbourhood?
# Your Answer Here
# Your Answer Here
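#assuming top_host is a Series of listing counts indexed by host_id (e.g. from value_counts)
top_host_df = pd.DataFrame(top_host)
top_host_df.reset_index(inplace=True)
#set names by position: the columns produced by reset_index differ between pandas versions
top_host_df.columns = ['Host_ID', 'P_Count']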
top_host_df
sns.barplot(x="Host_ID", y="P_Count", data=top_host_df,palette='Blues_d')
plt.xticks(rotation=45)
sns.boxplot(x = "neighbourhood_group", y='price', data = airbnb)
plt.xticks(rotation=45)
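If a few very expensive listings stretch the price axis, the boxes themselves can become hard to read. One option is to put numbers next to the plot with a per-group summary. The sketch below uses made-up prices and shows how a single extreme value pulls the mean far above the median:

```python
import pandas as pd

# Toy prices (made up); one extreme value in the Manhattan group.
toy = pd.DataFrame({
    "neighbourhood_group": ["Manhattan"] * 4 + ["Brooklyn"] * 4,
    "price": [100, 150, 200, 5000, 60, 80, 100, 120],
})

# Median, mean and maximum price per group.
summary = toy.groupby("neighbourhood_group")["price"].agg(["median", "mean", "max"])
print(summary)
```

On the real data, the same `groupby` applied to `airbnb` gives the figures behind each box in the plot.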
#restrict to numeric columns; text columns such as room_type cannot be correlated
airbnb.corr(numeric_only=True)
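Correlation is only defined for numeric columns, and recent pandas versions (2.0 and later, to my knowledge) raise an error when `corr()` is called on a frame that still contains text columns unless `numeric_only=True` is passed. A minimal sketch on a made-up toy table, selecting the numeric columns explicitly before computing the correlation matrix:

```python
import pandas as pd

# Toy stand-in for the listings table (values are made up for illustration).
toy = pd.DataFrame({
    "room_type": ["Private room", "Entire home/apt", "Private room", "Entire home/apt"],
    "price": [50, 200, 60, 180],
    "minimum_nights": [1, 3, 2, 4],
})

# Keep only numeric columns, then compute pairwise Pearson correlations.
numeric = toy.select_dtypes(include="number")
corr = numeric.corr()
print(corr)
```

Selecting the columns first also makes it explicit which variables enter the matrix, which is useful before passing it to a heatmap.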