# Documentation & Reading

This page collects documentation and readings for the course. Many of them, in whole or in part, are
linked from content in individual weeks, but they are gathered here for convenient
reference. Most of the material is from others; some I have written myself.

## Course Textbooks

-   :::{book} p4da
    :title: Python for Data Analysis
    :edition: 2nd Edition
    :author: Wes McKinney
    :publisher: O'Reilly
    :isbn: 978-1491957660

    You can read this book for free [through the Boise State Library](https://boisestate.on.worldcat.org/oclc/1005140249).
    :::

-   :::{book} tlds
    :title: Think Like a Data Scientist
    :author: Brian Godsey
    :publisher: Manning
    :isbn: 978-1633430273

    You can read this book for free [through the Boise State Library](https://boisestate.on.worldcat.org/v2/oclc/984515080).
    :::

An additional book you may find useful:

-   :::{book} hands
    :title: A Hands-On Introduction to Data Science
    :author: Chirag Shah
    :publisher: Cambridge

    :::

## Software Documentation

Quick links to software documentation:

-   [Python](https://docs.python.org/3/)
-   [Pandas](http://pandas.pydata.org/pandas-docs/stable/)
-   [Seaborn](https://seaborn.pydata.org/)
-   [Matplotlib](https://matplotlib.org/)
-   [Scikit-Learn](https://scikit-learn.org/stable/user_guide.html)
-   [Learning the Shell](http://linuxcommand.org/lc3_learning_the_shell.php)

## Statistics

More reading on probability and statistics:

- My [notes on probability](probability.md)
- [<cite class=free>NIST/SEMATECH e-Handbook of Statistical Methods</cite>](https://www.itl.nist.gov/div898/handbook/)

## Visualization

- [Seaborn gallery](https://seaborn.pydata.org/examples/index.html)
- [Seaborn tutorial](https://seaborn.pydata.org/tutorial.html) — organized topically, **very good resource**
- [Matplotlib gallery](https://matplotlib.org/gallery.html)
- [Plotnine gallery](https://plotnine.readthedocs.io/en/stable/gallery.html)
- [My plot utilities](https://md.ekstrandom.net/blog/2020/09/plots) (for preparing papers with `plotnine`)
- [<cite>W.E.B. Du Bois's Data Portraits: Visualizing Black America</cite>](https://boisestate.on.worldcat.org/v2/oclc/1023487386), edited by Whitney Battle-Baptiste. Historical data visualizations.

## Python

Further resources for the Python programming language:

-   <cite>Learn Python the Hard Way</cite> by Zed Shaw. More thorough treatment of Python.
-   <cite>Fluent Python</cite> by Luciano Ramalho. Learn advanced and idiomatic Python.

## Social Aspects

-   <cite>Data Feminism</cite> (<a class=free href="https://data-feminism.mitpress.mit.edu/">online version</a>) by Catherine D'Ignazio and Lauren F. Klein. Critical perspectives on data.
-   <cite>Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries</cite> by Alexandra Olteanu, Carlos Castillo, Fernando Diaz, and Emre Kiciman. Frontiers in Big Data 2:13. doi: 10.3389/fdata.2019.00013. Also available at SSRN (December 20, 2016): http://dx.doi.org/10.2139/ssrn.2886526

## Writing

Writing is part of this class, and it will play a particularly important role in your graduate career.
I hope these resources are helpful:

-   [<cite>Style: Lessons in Clarity and Grace</cite>](https://boisestate.on.worldcat.org/v2/oclc/919068263) — one of the best books I know to help you improve your writing.

## Diving Deeper

These resources will help you explore some of the things we touch on in this class in more depth, or expand your knowledge further:

-   [<cite>W.E.B. Du Bois's Data Portraits: Visualizing Black America</cite>](https://boisestate.on.worldcat.org/v2/oclc/1023487386), edited by Whitney Battle-Baptiste. Historical data visualizations.
-   [<cite>Statistics Done Wrong: The Woefully Complete Guide</cite>](http://www.worldcat.org/oclc/891609129) by Alex Reinhart. Also available in the O'Reilly Learning Center.
-   [<cite>How to Lie with Statistics</cite>](http://www.worldcat.org/oclc/1014298802) by Darrell Huff.
-   <cite>Counterfactuals and Causal Inference</cite>, 2nd Edition, by Stephen L. Morgan and Christopher Winship.
-   [<cite>An Introduction to Statistical Learning</cite>](https://www.statlearning.com/), by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani.
