Tuesday, April 4, 2017

Air Quality In Poland #09 - Is it worst in Rybnik?

As I promised in the last post, I should add a calculation of the overall air quality for each time point. This quality is defined by the worst category among all measured pollutants. To determine it, we need to examine each row and assign the proper category based on the descriptive values:

 for quality in qualities:  
   reducedDataFrame.loc[(reducedDataFrame[["C6H6.desc", "CO.desc", "NO2.desc", "O3.desc", "PM10.desc",   
                     "PM25.desc", "SO2.desc"]] == quality).any(axis=1),"overall"] = quality  

It might not be the optimal procedure, but it seems to be quite fast, at least on the reduced data frame. Since our qualities are sorted from best to worst, a worse value found in a later iteration overwrites the previous value in the overall column:

 qualities = sorted(descriptiveFrame.index.get_level_values(1).unique().tolist())  
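
To see the overwriting in action, here is a minimal sketch on a toy data frame. The frame `toy`, its three rows and the shortened category list are made up for illustration. Only the loop itself mirrors the code above. It assumes the category labels carry a numeric prefix (as in the results further down), so that alphabetical sorting matches severity.

```python
import pandas as pd

# Hypothetical toy data: two pollutants, three hourly readings.
toy = pd.DataFrame({
    "PM10.desc": ["1 Very good", "3 Moderate", "2 Good"],
    "SO2.desc": ["2 Good", "1 Very good", "1 Very good"],
})
desc_columns = ["PM10.desc", "SO2.desc"]

# The numeric prefix makes alphabetical order equal to best-to-worst order.
qualities = sorted(["3 Moderate", "1 Very good", "2 Good"])

for quality in qualities:
    # Rows where at least one pollutant falls into this category.
    has_quality = (toy[desc_columns] == quality).any(axis=1)
    # Later (worse) categories overwrite whatever was set before.
    toy.loc[has_quality, "overall"] = quality

print(toy["overall"].tolist())
```

Each row ends up with the worst category found among its pollutants, because the worst matching category is always the last one written.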

After generating the additional column, we also need to concatenate it with the descriptive data frame:

 overall = reducedDataFrame.groupby(level="Station")["overall"].value_counts(dropna =   
                                       False).apply(lambda x: (x/float(hours))*100)  
 descriptiveFrame = pd.concat([descriptiveFrame, overall], axis=1)  
 descriptiveFrame.rename(columns={0: "overall"}, inplace=True)  
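
The percentage step can be tried in isolation on a small example. This is only a sketch: the stations `StationA` and `StationB`, the four hours per station and the values are invented. `hours` stands in for the number of hours in the measured period, which comes from earlier code that is not shown here.

```python
import pandas as pd

# Hypothetical toy data: two stations, four hourly readings each.
index = pd.MultiIndex.from_product(
    [["StationA", "StationB"], range(4)], names=["Station", "Hour"])
toy = pd.DataFrame(
    {"overall": ["2 Good", "2 Good", "3 Moderate", "1 Very good",
                 "1 Very good", "2 Good", "2 Good", "2 Good"]},
    index=index)
hours = 4  # assumed number of hours in the toy period

# Count each category per station and turn the counts into percentages.
share = (toy.groupby(level="Station")["overall"]
            .value_counts(dropna=False) / float(hours) * 100)
print(share)
```

The result is a series indexed by station and category, which is the shape that gets concatenated with the descriptive data frame above.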

And what are the results?

 LuZarySzyman NaN          NaN  
        1 Very good   9.601553  
        2 Good     57.266811  
        3 Moderate   26.955132  
        4 Sufficient   3.482133  
        5 Bad      0.890513  
        6 Very bad    0.308254  
 MzLegZegrzyn NaN          NaN  
        1 Very good   1.255851  
        2 Good     50.941888  
        3 Moderate   31.693116  
        4 Sufficient   8.425619  
        5 Bad      3.950223  
        6 Very bad    2.580203  
 MzPlocKroJad NaN          NaN  
        1 Very good   21.965978  
        2 Good     60.806028  
        3 Moderate   15.983560  
        4 Sufficient   0.947597  
        5 Bad      0.102751  
        6 Very bad    0.011417  
 OpKKozBSmial NaN          NaN  
        1 Very good   2.922708  
        2 Good     54.446855  
        3 Moderate   30.117593  
        4 Sufficient   6.302089  
        5 Bad      4.144309  
        6 Very bad    2.009362  
 PmStaGdaLubi NaN          NaN  
        1 Very good   43.155611  
        2 Good     38.075123  
        3 Moderate   12.204590  
        4 Sufficient   3.539217  
        5 Bad      1.758192  
        6 Very bad    1.164516  
 SlRybniBorki NaN          NaN  
        1 Very good   1.541272  
        2 Good     56.444800  
        3 Moderate   27.662975  
        4 Sufficient   6.781596  
        5 Bad      3.242379  
        6 Very bad    3.014043  
 Name: overall, dtype: float64  

It seems that the share of very bad data points in Rybnik has not changed. But the OpKKozBSmial station, for example, has 2.009362 percent of very bad data points overall, while its individually worst pollutant is very bad for only 1.198767 percent of the time. So it seems that other pollutants are also significant there, which is confirmed by their values of 0.913346 and 0.399589.
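
Why the overall share lies between the worst single pollutant and the sum of all pollutants can be shown with a tiny sketch. The hour numbers and pollutant sets below are made up and are not the station's real data. The point is only that hours where two pollutants are very bad at the same time are counted once in the overall column.

```python
# Hypothetical hour indices at which each pollutant was "very bad".
very_bad_hours = {
    "PM10": {1, 2, 3},
    "PM25": {3, 4},
    "SO2": {7},
}
total_hours = 100  # assumed length of the toy period

# Share of very bad hours for each pollutant separately.
individual = {name: 100 * len(hrs) / total_hours
              for name, hrs in very_bad_hours.items()}

# Overall share: an hour counts once, no matter how many pollutants were very bad.
overall = 100 * len(set().union(*very_bad_hours.values())) / total_hours

print(individual, overall)
```

Here hour 3 is shared by two pollutants, so the overall share is larger than the worst individual share but smaller than the sum of all three.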

Next post: looking for the place with the best air quality in Poland. I hope that my laptop will not explode.
