9b Data Visualization

Cards (11)

  • Data visualization (data viz)

    Representing data visually, e.g., through graphs, plots, charts
  • matplotlib library

    Popular library for creating plots in Python, but the default plots aren't very visually appealing unless you write a lot of extra code
  • seaborn library

    Uses matplotlib functions "under the hood"; lets you create attractive plots with very little code
  • To use seaborn functions, we need to import both matplotlib and seaborn
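
A minimal sketch of the imports, assuming the conventional aliases plt and sns (the card does not name aliases, but the syntax cards in this set use sns):

```python
# matplotlib supplies the figure and axes that seaborn draws on
import matplotlib.pyplot as plt
# "sns" is the conventional alias used in the syntax cards
import seaborn as sns

# In a script, plt.show() displays whatever seaborn has drawn
```
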
  • Scatter plots
    Useful for understanding the relationship between two variables; created using either scatterplot() or regplot()
    Syntax: plot_name = sns.scatterplot(data = df, x = 'col1', y = 'col2')
  • Scatter plot with regplot()

    Adds a fitted regression line to the scatter plot, which makes it easier to see whether the two variables have a positive or negative linear relationship
    Syntax: plot_name = sns.regplot(data = df, x = 'col1', y = 'col2')
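
The two scatter plot functions can be compared side by side. This is only a sketch: the DataFrame and its column names (hours, score) are made up for illustration, and the ax argument is used just to place both plots in one figure.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: hours studied vs. exam score
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5, 6],
    "score": [52, 58, 65, 70, 78, 85],
})

# One figure with two side-by-side axes
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Plain scatter plot: points only
scatter = sns.scatterplot(data=df, x="hours", y="score", ax=ax1)

# Same points plus a fitted regression line
reg = sns.regplot(data=df, x="hours", y="score", ax=ax2)
```

Both functions return the matplotlib Axes they drew on, which is what plot_name holds in the syntax cards.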
  • Histograms are used to visualize the distribution of continuous data; count plots are used to visualize counts of categorical data
  • Histograms
    Created using the function histplot()
    Syntax: plot_name = sns.histplot(data = df, x = 'col')
  • In plots you can change the color or marker using the syntax:
    plot_name = sns.regplot(data = df, x = 'col1', y = 'col2', color = ' ')
    plot_name = sns.regplot(data = df, x = 'col1', y = 'col2', marker = ' ')
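
As an illustration of the color and marker arguments, the sketch below fills in the blanks with example values. The values "green" and "x" and the data are assumptions chosen for the example; any matplotlib color name or marker symbol should work in their place.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data for illustration
df = pd.DataFrame({
    "col1": [1, 2, 3, 4, 5, 6],
    "col2": [2.1, 3.9, 6.2, 7.8, 10.1, 12.2],
})

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# color takes a matplotlib color name; it applies to the points and the line
green_plot = sns.regplot(data=df, x="col1", y="col2", color="green", ax=ax1)

# marker takes a matplotlib marker symbol; "x" draws crosses instead of dots
cross_plot = sns.regplot(data=df, x="col1", y="col2", marker="x", ax=ax2)
```
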
  • Long format
    Best for point plots, bar plots
  • Wide format
    Best for scatter/regression plots, histograms, count plots
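
The difference between the two formats can be sketched with pandas. The cards do not say how to convert between them; pd.melt() is one common way and is an assumption here, as are the data and the column names (student, test1, test2, test, score).

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Wide format: one row per student, one column per measured variable
wide_df = pd.DataFrame({
    "student": ["Ana", "Ben", "Cal"],
    "test1": [70, 82, 91],
    "test2": [75, 80, 95],
})

# Long format: one row per (student, test) pair, with the values in one column
long_df = pd.melt(
    wide_df,
    id_vars="student",
    value_vars=["test1", "test2"],
    var_name="test",
    value_name="score",
)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Wide format: each axis is its own column, which suits a scatter plot
scatter = sns.scatterplot(data=wide_df, x="test1", y="test2", ax=ax1)

# Long format: a category column and a value column, which suits a bar plot
bars = sns.barplot(data=long_df, x="test", y="score", ax=ax2)
```
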