Posts

Final Project

Problem Description

The mtcars dataset contains various car attributes such as miles per gallon (mpg), number of cylinders (cyl), horsepower (hp), and weight (wt), among others. Each row in the dataset represents a different car model. The problem we’re addressing here is to understand how these attributes relate to each other. Specifically, we’re interested in the relationship between the number of cylinders a car has (cyl) and its fuel efficiency, measured in miles per gallon (mpg).

The number of cylinders in a car is a key factor that can influence its performance characteristics, including its fuel efficiency. Cars with more cylinders tend to have more power, which can result in lower fuel efficiency. However, this is not always the case as other factors such as the car’s weight, aerodynamics, and engine technology can also play a role. By visualizing and analyzing the data, we aim to gain insights into these relationships. This could help car manufacturers design more fuel-efficie...

Module 13 Assignment

For this week's assignment, I created a simple animation that displays 10 frames. Each frame is a scatter plot of 10 randomly generated numbers between 0 and 1. The following is the code used:

library(animation)
saveGIF({
  for (i in 1:10) {
    plot(runif(10), ylim = 0:1)
    ani.pause()
  }
}, movie.name = "scatter_plot_random.gif")

This generates the following animation:

Module 12 Assignment

For this assignment, I chose to use RStudio with ggnet2 (network visualization with ggplot2) to create a visual social network analysis. Below is the code I wrote for it. It uses the provided libraries plus the "igraph" library to generate a random graph, then "ggnet2" to visualize it.

library(GGally)
library(network)
library(sna)
library(igraph)
library(ggplot2)

# Generate a random graph with 10 nodes and a 30% chance of an edge between nodes
g <- erdos.renyi.game(10, 0.3, type = "gnp")

# Convert the igraph object to a network object
# (sparse = FALSE returns a plain dense matrix rather than a sparse Matrix object)
net <- network::as.network.matrix(as_adjacency_matrix(g, sparse = FALSE))

# Visualize the network
ggnet2(net, mode = "fruchtermanreingold", size = "degree", label = TRUE)

In this example, the mode parameter is set to "fruchtermanreingold", which means that the Fruchterman-Reingold algorithm is used for the layout of the network. The size parameter is set to "degree", which means tha...

Module 11 assignment

The visual I created is a dot-dash plot of per capita budget expenditures in constant dollars from 1967 to 1977. The dots represent the data points, and the lines connect the dots. The horizontal dashed lines at 5% and 6% are added for reference. The text on the right side of the plot provides additional information about the data.

Module 10 Assignment

Each of the provided reading resources offered useful insight into how ggplot2 is used for visualization. The visualization I created uses the airquality dataset that is built into R.

library(ggplot2)
head(airquality)
airquality <- na.omit(airquality)
# Day repeats in every month, so build a real date for the x-axis
# (the airquality readings run from May to September 1973)
airquality$Date <- as.Date(paste(1973, airquality$Month, airquality$Day, sep = "-"))
ggplot(airquality, aes(x=Date, y=Ozone)) +
  geom_line() +
  labs(x="Date", y="Ozone Reading",
       title="Time Series Plot of Ozone Readings")

In this plot, the x-axis represents the date and the y-axis represents the Ozone readings. The line represents the trend of Ozone readings over time. Now, let’s discuss the importance of visualization in time series analysis:

Trend Identification: Visualizations can help identify overall trends in the data. For example, in the plot above, you might be able to see if the Ozone readings are generally increasing, decreasing, or staying the same over time.

Seasonality Detection: In some datasets, there might be patterns that repeat at reg...

Module #7 Assignment

This code will create a boxplot of the mpg variable, grouped by the number of cylinders (cyl). The geom_boxplot() function from the ggplot2 package is used to create the boxplot. The labs() function is used to add labels to the x and y axes, and ggtitle() is used to add a title to the plot. Regarding Few’s recommendations, they are generally well-regarded in the field of data visualization. Using a grid to enhance comparisons between scatter plots can indeed be helpful. It allows the viewer to more easily compare the distributions of different variables or groups. However, like all recommendations, it may not be applicable in every situation. It’s always important to consider the specific context and goals of your visualization when deciding which techniques to use.
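The code itself is not shown in this excerpt. A minimal sketch of what the paragraph describes might look like the following. It assumes the data is the built-in mtcars dataset (which has both mpg and cyl columns), and the axis labels and title are placeholders rather than the ones used in the original post.

```r
library(ggplot2)

# Assumption: the data is the built-in mtcars dataset.
# cyl is converted to a factor so each cylinder count gets its own box.
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  # Placeholder labels and title; the original post's wording is not known
  labs(x = "Number of cylinders", y = "Miles per gallon") +
  ggtitle("Fuel efficiency by cylinder count")

print(p)
```

Wrapping cyl in factor() matters here: left as a numeric variable, ggplot2 would treat it as continuous and would not split the boxes by cylinder count without an explicit group aesthetic.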

Module #6 Assignment

I already had some experience with data visualization in R, which made this assignment go much more smoothly than expected.

My R code for this assignment was the following:

Running this code generates a bar plot that looks like the following:

Both Few and Yau emphasize the importance of simplicity and clarity in data visualization. I feel my example does this well, as it captures the full range of necessary information in a way that is simple and clean.