
Lec 12 NumPy, OS, Stats

NumPy

import numpy as np
import pandas as pd

# Every column list must have the same length (one entry per movie)
data = {
    'title': ['The Matrix', 'Inception', 'Interstellar', 'Blade Runner'],
    'year': [1999, 2010, 2014, 1982],
    'rating': [8.7, 8.8, 8.6, 4.0]
}
df = pd.DataFrame(data)
# print(np.mean(df['rating']))  # mean of the rating column
for index, row in df.iterrows():
    print(f'{row["title"]} was released in {row["year"]}')
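
For a single summary value, a loop is usually not needed. A minimal sketch, assuming a small made-up table (the name `movies` and its two rows are only for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical two-row table, for illustration only
movies = pd.DataFrame({
    'title': ['The Matrix', 'Inception'],
    'rating': [8.7, 8.8],
})

# NumPy functions accept a DataFrame column directly
mean_rating = np.mean(movies['rating'])

# idxmax gives the row label of the highest rating; loc looks up its title
top_title = movies.loc[movies['rating'].idxmax(), 'title']

print(mean_rating, top_title)
```

iterrows is still fine for printing row by row; the column-wise calls are just shorter when only an aggregate is needed.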

To add a row, create another DataFrame and concatenate:

new_movie = pd.DataFrame({
    'title': ['Dune'],
    'year': [2021],
    'rating': [8.0]
})
# ignore_index=True renumbers the rows 0..n-1 after the append
df = pd.concat([df, new_movie], ignore_index=True)

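To add a column, assign to a new column name:

df_people = pd.DataFrame({
    'Person': []
})
# 'Age' is a hypothetical column name; the values must match the number of rows (zero here)
df_people['Age'] = []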
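
A rough sketch of adding columns to a non-empty DataFrame. The names and values (`Ana`, `Age`, `BirthYear`, the reference year 2024) are made up for illustration and are not from the lecture:

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df_people = pd.DataFrame({
    'Person': ['Ana', 'Ben', 'Cleo']
})

# Assign a list: it needs one value per existing row
df_people['Age'] = [30, 25, 41]

# Derive a column from an existing one (2024 is an arbitrary reference year)
df_people['BirthYear'] = 2024 - df_people['Age']

print(df_people)
```

Assigning a list whose length differs from the number of rows raises a ValueError, so the list form works best when the values are known for every row.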

Raising Errors:

# Any exception class works here, built-in or custom
raise ValueError("description of what went wrong")
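
One possible way to define and raise a custom error, then catch it. `InvalidRatingError` and `check_rating` are hypothetical names invented for this sketch:

```python
class InvalidRatingError(ValueError):
    """Hypothetical custom error for ratings outside 0 to 10."""


def check_rating(rating):
    # Raise when the input is invalid; otherwise hand the value back
    if not 0 <= rating <= 10:
        raise InvalidRatingError(f"rating must be between 0 and 10, got {rating}")
    return rating


try:
    check_rating(11.5)
except InvalidRatingError as err:
    message = str(err)
    print(f"Caught: {message}")
```

Subclassing a built-in such as ValueError means callers that already catch ValueError will also catch the custom error.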

OS

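import os

os.path.exists("path/file")         # True if the file or directory exists
os.path.join("base/path", 'name')   # joins the parts with the OS path separator
os.listdir('path/dir')              # list of entry names inside the directory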
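
The three calls can be tried together in a throwaway directory. This is only a sketch; the file name `notes.txt` is made up, and `tempfile` is used so no real files are touched:

```python
import os
import tempfile

# Work inside a temporary directory that is deleted when the block ends
with tempfile.TemporaryDirectory() as base_dir:
    file_path = os.path.join(base_dir, 'notes.txt')  # hypothetical file name

    existed_before = os.path.exists(file_path)  # nothing written yet
    with open(file_path, 'w') as f:
        f.write('hello')
    existed_after = os.path.exists(file_path)

    entries = os.listdir(base_dir)  # names only, not full paths

print(existed_before, existed_after, entries)
```

Note that os.listdir returns bare names, so they usually need os.path.join with the directory before being opened.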

Statistics

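  • Correlation: a measure of how strongly two variables are related, i.e. how closely they move together
  • Finding the correlation coefficient (np.corrcoef gives the Pearson coefficient, between -1 and 1):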
df = pd.DataFrame({
    'age': [13, 19, 3],
    'Height': [155, 175, 95],  # hypothetical heights in cm, for illustration
})
print(np.corrcoef(df['age'], df['Height']))
# corrcoef returns a 2x2 matrix:
"""
[[corr(x,x), corr(x,y)],
 [corr(y,x), corr(y,y)]]
"""
  • Histograms: 1 variable
  • Bar graphs, scatterplots: 2 variables
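
The points in the list above can be sketched numerically: a correlation coefficient for two variables and histogram counts for one. All values and names (`hours`, `score`, `ratings`, the bin range 4 to 10) are made up for illustration, and `np.histogram` only computes the counts rather than drawing a plot:

```python
import numpy as np

# Hypothetical values, for illustration only
hours = np.array([1.0, 2.0, 3.0, 4.0])
score = 2 * hours + 1  # exactly linear in hours
ratings = np.array([8.7, 8.8, 8.6, 4.0, 5.6, 8.0])

# Two variables: the off-diagonal entry of the 2x2 matrix is corr(x, y)
matrix = np.corrcoef(hours, score)
r = matrix[0, 1]
print(round(float(r), 6))

# One variable: count how many values fall in each of three equal-width bins
counts, bin_edges = np.histogram(ratings, bins=3, range=(4, 10))
for left, right, count in zip(bin_edges[:-1], bin_edges[1:], counts):
    print(f'{left:.0f} to {right:.0f}: {count}')
```

Because score is an exact linear function of hours here, the coefficient comes out at (numerically) 1; real data would typically land somewhere strictly between -1 and 1.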