The NASA Exoplanet Archive is a database of exoplanets and their host stars. It contains information on over 5,000 confirmed exoplanets and thousands of candidates. The data is collected from a variety of sources and is updated regularly. The archive is a valuable resource for scientists and anyone interested in learning more about exoplanets.
NASA Exoplanet Data
Exoplanet data
- Name: Name of the exoplanet.
- Mass (MJ): Mass of the exoplanet in terms of Jupiter masses (MJ).
- Radius (RJ): Radius of the exoplanet in terms of Jupiter radii (RJ).
- Period (days): Orbital period of the exoplanet in days.
- Semi-major axis (AU): Semi-major axis of the exoplanet's orbit in astronomical units (AU).
- Temp: Temperature of the exoplanet in kelvins (K).
- Discovery method: Method used to discover the exoplanet.
- Disc. Year: Year of discovery.
- Distance (ly): Distance from Earth in light years (ly).
- Host star mass (M☉): Mass of the host star in terms of solar masses (M☉).
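Before running any analysis, it can help to confirm that a loaded table has the columns listed above. The following is a minimal sketch, assuming the CSV headers match the labels in the list exactly, which may not hold for every export. The `missing_columns` helper and the sample row are hypothetical and exist only for illustration.

```python
import pandas as pd

# Column names taken from the list above; the exact headers in a real
# export are an assumption and may differ.
EXPECTED_COLUMNS = [
    'Name', 'Mass (MJ)', 'Radius (RJ)', 'Period (days)',
    'Semi-major axis (AU)', 'Temp', 'Discovery method', 'Disc. Year',
    'Distance (ly)', 'Host star mass (M☉)',
]


def missing_columns(df):
    """Return the expected columns that are absent from df, in list order."""
    return [col for col in EXPECTED_COLUMNS if col not in df.columns]


# Made-up placeholder row, not real archive data; it only shows the shape
# of the table.
sample = pd.DataFrame([{
    'Name': 'Example b',
    'Mass (MJ)': '1.2±0.1',
    'Radius (RJ)': '1.1',
    'Period (days)': 4.2,
    'Semi-major axis (AU)': 0.05,
    'Temp': 1200,
    'Discovery method': 'Transit',
    'Disc. Year': 2010,
    'Distance (ly)': '1,000',
    'Host star mass (M☉)': 1.0,
}])

print(missing_columns(sample))
print(missing_columns(sample.drop(columns=['Temp'])))
```

An empty list means every expected column is present; anything else names the headers to rename or fix before plotting.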
Analysis
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
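def clean_value(value):
    """Convert strings such as '1,234' or '0.5±0.1' to floats."""
    if not isinstance(value, str):
        return value
    # drop thousands separators
    value = value.replace(',', '')
    # keep the central value and discard any '±' uncertainty;
    # strings without '±' are converted as they are
    return float(value.split('±')[0])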
df = pd.read_csv('exoplanets.csv')
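# when were they discovered?
year = 'Disc. Year'
max_year = int(df[year].max())
min_year = int(df[year].min())
# one bin per discovery year
df[[year]].plot.hist(bins=1 + max_year - min_year, legend=True)
plt.title('Discoveries per year')
plt.show()
# how were they discovered?
method = 'Discovery method'
df[method].value_counts().plot(kind='pie')
plt.title('Discovery methods')
# show each figure before starting the next so the plots do not overlap
plt.show()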
# do we expect to see a correlation between mass and distance?
distance = 'Distance (ly)'
mass = 'Mass (MJ)'
df[distance] = df[distance].apply(clean_value)
df[mass] = df[mass].apply(clean_value)
sns.scatterplot(
    data=df, x=distance, y=mass, hue=method, palette='tab10', legend=False
)
plt.title('Mass vs. Distance')
plt.show()
# how about mass vs radius?
radius = 'Radius (RJ)'
df[radius] = df[radius].apply(clean_value)
sns.scatterplot(
    data=df, y=mass, x=radius, hue=method, palette='tab10', legend=False
)
plt.title('Mass vs. Radius')
plt.show()
# maybe a pairplot will help
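grid = sns.pairplot(
    data=df[[mass, radius, distance, method]],
    hue=method, diag_kind='kde', palette='tab10'
)
# pairplot returns a grid of axes, so the title goes on the whole figure
grid.figure.suptitle('Pairplot of exoplanet properties', y=1.02)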
plt.show()
# let's look directly at the correlations
# the discovery method is categorical, so it is left out of the correlation matrix
correlations = df[[mass, radius, distance]].corr()
sns.heatmap(correlations, cmap='coolwarm')
plt.title('Correlations between exoplanet properties')
plt.show()
print(correlations)
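To see what the cleaning step and the correlation matrix do on a small scale, here is a minimal, self-contained sketch. The values in `toy` are made up for illustration and are not archive data. It uses a version of `clean_value` that converts every string to a float, and it restricts the correlation to the numeric columns.

```python
import pandas as pd


def clean_value(value):
    """Convert strings such as '1,000' or '3.0±0.5' to floats."""
    if not isinstance(value, str):
        return value
    value = value.replace(',', '')
    # keep the central value, drop any '±' uncertainty
    return float(value.split('±')[0])


# Made-up values, only to exercise the cleaning and correlation steps.
toy = pd.DataFrame({
    'Mass (MJ)': ['1.0±0.1', '2.0', '3.0±0.5', '4.0'],
    'Radius (RJ)': ['0.5', '1.0±0.1', '1.5', '2.0'],
    'Distance (ly)': ['1,000', '400', '300±20', '200'],
    'Discovery method': ['Transit', 'Transit', 'Radial velocity', 'Imaging'],
})

numeric_cols = ['Mass (MJ)', 'Radius (RJ)', 'Distance (ly)']
for col in numeric_cols:
    toy[col] = toy[col].apply(clean_value)

# Pearson correlation between the numeric columns only
correlations = toy[numeric_cols].corr()
print(correlations.round(2))
```

In this toy table the mass and radius values rise together and the distance falls as mass rises, so the matrix shows one strongly positive and one negative entry. Real data will be far noisier.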