DRA-chaos / DSP-615-Final-Project-Spectroscopic-classification-of-Photometric-data-from-SDSS-DR-17

DSE-615-Spectroscopic-classification-of-Photometric-data-from-the-SDSS-DR17

The Sloan Digital Sky Survey (SDSS) Data Release 17 (DR17) catalog contains photometric data for every object the survey imaged and spectroscopic data for a small fraction of them. I trained machine learning classification models on the labelled subset, that is, the photometric data whose class has been assigned spectroscopically. The exploratory data analysis (EDA) helped in selecting appropriate models for the dataset. I then applied the trained models with the highest accuracy on the training set to fresh, unclassified data. Three classes of astronomical objects (stars, galaxies, and quasars) are predicted using a variety of machine learning methods. The techniques considered include KNN, SVM, Random Forest, and Decision Tree; venturing into neural networks, I also experimented with a Multilayer Perceptron classifier. Before the statistical analysis, a thorough visual analysis was carried out using violin plots, KDE distributions, probability density functions, and frequency polygons, to name a few. This project aims to give a thorough understanding of the data visually, statistically, mathematically, and analytically.

The classification models under consideration are:

  • k-Nearest Neighbours
  • Decision Trees
  • Random Forest
  • Logistic Regression
  • Naive Bayes
  • Support Vector Machines
  • Multilayer Perceptron classifier
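
As a rough illustration of how the models listed above could be compared, here is a minimal sketch using scikit-learn. It is not the notebook's actual code: the data is a synthetic stand-in generated with `make_classification` (not real SDSS photometry), the hyperparameters are library defaults or arbitrary choices, and the helper name `evaluate` is hypothetical. The sketch scores each model on a held-out split, which is an assumption of this example rather than the procedure described above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the labelled photometric table (NOT real SDSS data):
# five numeric features and three classes playing the role of
# star / galaxy / quasar.
X, y = make_classification(
    n_samples=600, n_features=5, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# One estimator per model in the list above; settings are illustrative only.
models = {
    "k-Nearest Neighbours": KNeighborsClassifier(n_neighbors=5),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "Support Vector Machine": SVC(),
    "Multilayer Perceptron": MLPClassifier(max_iter=1000, random_state=0),
}


def evaluate(models, X_train, y_train, X_test, y_test):
    """Fit each model behind a feature scaler and return its test accuracy."""
    scores = {}
    for name, model in models.items():
        pipeline = make_pipeline(StandardScaler(), model)
        pipeline.fit(X_train, y_train)
        scores[name] = pipeline.score(X_test, y_test)
    return scores


scores = evaluate(models, X_train, y_train, X_test, y_test)
best = max(scores, key=scores.get)

for name, accuracy in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name:25s} accuracy = {accuracy:.3f}")
print("Best model on the held-out split:", best)
```

Scaling is included because distance-based and gradient-based models (KNN, SVM, MLP) are sensitive to feature ranges; the tree-based models do not need it, but it does not hurt them here.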

This study illustrates the importance of data and its interpretation in physics by working with state-of-the-art data analysis tools. The way we look at data has changed over the centuries, yet the fundamentals of analysis remain more or less the same. Physics, and science in general, has always been data-driven, going back to the early philosophers and astronomers who looked up at the sky and tabulated the positions of planets and constellations. Today, enormous volumes of scientific data are produced every day, and much of it is made available to the public through datasets and libraries.

The DataSet

The datasets used in this study serve as the raw material for the work. They come from publicly available sources; in fact, the creators of the data encourage both the scientific and the non-scientific community to access and explore the data and to apply their intuition using machine learning and statistical tools.

Sloan Digital Sky Survey

What is the Sloan Digital Sky Survey?

The survey will map one-quarter of the entire sky in detail, determining the positions and absolute brightness of hundreds of millions of celestial objects. It will also measure the distances to more than a million galaxies and quasars. The SDSS addresses fascinating, fundamental questions about the universe. With the survey, astronomers will be able to see the large-scale patterns of galaxies: sheets and voids through the whole universe. Scientists have many ideas about how the universe evolved, and different patterns of large-scale structure point to different theories. The Sloan Digital Sky Survey will tell us which theories are right - or whether we will have to come up with entirely new ideas.

Measuring Distance and Time: Redshift

The universe is expanding like a loaf of raisin bread rising in an oven. Pick any raisin, and imagine that it's our own Milky Way galaxy. If you place yourself on that raisin, then no matter how you look at the loaf, as the bread rises, all the other raisins move away from you. The farther away another raisin is from you, the faster it moves away. In the same way, all the other galaxies are moving away from ours as the universe expands. And because the universe is uniformly expanding, the farther a galaxy is from Earth, the faster it is receding from us. The light coming to us from these distant objects is shifted toward the red end of the electromagnetic spectrum, in much the same way the sound of a train whistle changes as a train leaves or approaches a station. The faster a distant object is moving, the more it is redshifted. Astronomers measure the amount of redshift in the spectrum of a galaxy to figure out how far away it is from us. By measuring the redshifts of a million galaxies, the Sloan Digital Sky Survey will provide a three-dimensional picture of our local neighborhood of the universe.
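
The relationship described above can be made concrete with a small worked example. This is a minimal sketch, not part of the original project: it uses the standard definition of redshift, z = (observed wavelength - rest wavelength) / rest wavelength, the low-redshift approximation v ≈ c·z, and Hubble's law d = v / H0. The value H0 = 70 km/s/Mpc and the observed wavelength are assumptions chosen for illustration, and the approximation only holds for small z.

```python
C_KM_S = 299_792.458  # speed of light in km/s


def redshift(lambda_observed, lambda_rest):
    """Redshift z from observed and rest-frame wavelengths (same units)."""
    return (lambda_observed - lambda_rest) / lambda_rest


def recession_velocity_km_s(z):
    """Recession velocity in km/s, using the low-z approximation v = c * z."""
    return C_KM_S * z


def hubble_distance_mpc(z, h0=70.0):
    """Distance in Mpc from Hubble's law d = v / H0.

    h0 is the Hubble constant in km/s/Mpc; 70.0 is an assumed round value.
    """
    return recession_velocity_km_s(z) / h0


# Hypothetical measurement: the H-alpha line (rest wavelength 6562.8 Angstrom)
# observed at 6694.1 Angstrom in a galaxy spectrum.
z = redshift(6694.1, 6562.8)
velocity = recession_velocity_km_s(z)
distance = hubble_distance_mpc(z)

print(f"z = {z:.4f}, v = {velocity:.0f} km/s, d = {distance:.1f} Mpc")
```

For the distant quasars in the catalog, z is not small, so the linear approximation breaks down and a full cosmological distance calculation would be needed instead.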

If you have any suggestions, feel free to contact the author.

Thanks and Regards, Rita
