Plotly Express#
Background#
Plotly was founded in 2012 by Canadian software engineers and burst onto the scene at PyCon in 2013. The original differentiator was
Plotly is very popular among new Python users.
Data#
Weekly Retail Food Sales from the USDA.
I picked this dataset because it was the first one I found on Google. It’s an Excel file, but we’ll get to explore some of the Polars read_excel() functionality.
https://www.ers.usda.gov/data-products/weekly-retail-food-sales/
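import polars as pl
df = pl.read_excel(
    "data/NationalTotalAndSubcategory.xlsx",
    # These options are passed to the CSV reader that Polars runs on the converted sheet.
    # Newer Polars releases appear to expose this argument as read_options instead.
    read_csv_options={
        "skip_rows": 1,  # skip the first row of the sheet
        "null_values": "NA",  # treat "NA" cells as null
    },
)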
When reading Excel files:#
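Always check head() and tail(). Spreadsheets often carry title rows above the data and footnotes below it, as the last five rows of this frame show.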
df
shape: (193, 11)
Date | Dollars | Dollars last year | Dollars 3 years ago | Unit sales | Unit sales last year | Unit sales 3 years ago | Percent change dollars 1 year | Percent change units 1 year | Percent change dollars 3 years | Percent change units 3 years |
---|---|---|---|---|---|---|---|---|---|---|
str | i64 | i64 | str | i64 | i64 | str | f64 | f64 | str | str |
"2019-10-06" | 13043274747 | 12832064544 | null | 4473321014 | 4502907925 | null | 1.6 | -0.7 | null | null |
"2019-10-13" | 13091696087 | 12739265424 | null | 4501526226 | 4476085741 | null | 2.8 | 0.6 | null | null |
"2019-10-20" | 12817973722 | 12432656011 | null | 4368604765 | 4336490063 | null | 3.1 | 0.7 | null | null |
"2019-10-27" | 12718410768 | 12368178414 | null | 4326635282 | 4288580030 | null | 2.8 | 0.9 | null | null |
"2019-11-03" | 13053018493 | 12752512115 | null | 4394371559 | 4398657676 | null | 2.4 | -0.1 | null | null |
"2019-11-10" | 12930706873 | 12700012879 | null | 4456077366 | 4466802183 | null | 1.8 | -0.2 | null | null |
"2019-11-17" | 12921543187 | 13797643275 | null | 4429241111 | 4795929252 | null | -6.3 | -7.6 | null | null |
"2019-11-24" | 13988322777 | 13486661480 | null | 4755163500 | 4603055976 | null | 3.7 | 3.3 | null | null |
"2019-12-01" | 13840106375 | 11816453042 | null | 4621273384 | 4129599448 | null | 17.1 | 11.9 | null | null |
"2019-12-08" | 12595884386 | 12985481234 | null | 4304472509 | 4518769338 | null | -3.0 | -4.7 | null | null |
"2019-12-15" | 13358331801 | 13037915930 | null | 4507643605 | 4471273642 | null | 2.5 | 0.8 | null | null |
"2019-12-22" | 15186042814 | 14987301886 | null | 4947977096 | 4966518760 | null | 1.3 | -0.4 | null | null |
… | … | … | … | … | … | … | … | … | … | … |
"2023-03-26" | 16274613840 | 15253549272 | "14372804780" | 4279415118 | 4375869923 | "4662786695" | 6.7 | -2.2 | "13.2" | "-8.19999999999… |
"2023-04-02" | 16549773889 | 15448566490 | "15545074300" | 4349695213 | 4422104047 | "4995525953" | 7.1 | -1.6 | "6.5" | "-12.9" |
"2023-04-09" | 18613531784 | 16043007156 | "15808110159" | 4871689701 | 4577588527 | "5066490184" | 16.0 | 6.4 | "17.7" | "-3.8" |
"2023-04-16" | 16298798999 | 17292495342 | "14915492824" | 4287683374 | 4902751831 | "4748699131" | -5.7 | -12.5 | "9.300000000000… | "-9.69999999999… |
"2023-04-23" | 16322305069 | 15067292480 | "15002171687" | 4275687457 | 4285501343 | "4747494044" | 8.3 | -0.2 | "8.800000000000… | "-9.9" |
"2023-04-30" | 16191226951 | 15390616073 | "15560426999" | 4238126460 | 4348846639 | "4878708893" | 5.2 | -2.5 | "4.099999999999… | "-13.1" |
"2023-05-07" | 16915806638 | 16244713690 | "15825035903" | 4416363388 | 4520591185 | "4934237181" | 4.1 | -2.3 | "6.9" | "-10.5" |
"NA = data are … | null | null | null | null | null | null | null | null | null | null |
"Note: The seri… | null | null | null | null | null | null | null | null | null | null |
"Source: USDA, … | null | null | null | null | null | null | null | null | null | null |
"Data as of May… | null | null | null | null | null | null | null | null | null | null |
"Errata: On Jul… | null | null | null | null | null | null | null | null | null | null |
import plotly.express as px
Bar Graph#
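# Load Plotly's bundled tips dataset (this replaces the sales data held in df)
df = px.data.tips()
# px.histogram bins total_bill and draws the count in each bin as a bar
fig = px.histogram(df, x="total_bill")
fig.show()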