ClusterPluck


How to conduct a simple name search

This tutorial assumes the clusterpluck package is already installed and demonstrates how to search for an open cluster using just its name.

First, import the module and classes.

[1]:
import clusterpluck as cp
from clusterpluck.gaia import Refine, Info, Plotting
import warnings
warnings.filterwarnings("ignore")
%matplotlib inline

Then, perform the search, downloading the basic cluster data from the SIMBAD Astronomical Database and the individual star data from the Gaia archive. The default search is processed using Gaia DR2.

The name of the cluster must be passed as a string in a recognised format, e.g. M.. for the Messier catalogue or NGC.. for the New General Catalogue. Lesser-known catalogues may also work, but they can cause an error if the format doesn’t match SIMBAD’s.

The results will be downloaded and stored as a CSV file.

[2]:
cp.search_name('M47')
Number of stars: 992
RA: 07 36 35 Dec: -14 29.3 Rad: 0.5
PM_RA: -7.02 PM_Dec: 0.9592 PM_Rad: 2
Distance range: 242 pc to 725 pc

Messier 47 is an open cluster in Puppis. The output of the search contains the following information:

  • The number of stars downloaded from the Gaia database.

  • The right ascension (RA), declination (Dec) and search radius of the cluster. This is its position in the sky and its size.

  • The proper motion RA, Dec and proper motion search radius. This is the rate of the cluster’s apparent movement across the sky. Cluster stars all share approximately the same apparent movement and so form a tight group when these data are plotted.

  • Finally, the distance range of the search. This helps filter out many unrelated stars but can also cause cluster stars to be lost, particularly for objects more than 1 kpc (1000 pc) away.

Any or all of these can be amended by using the general search() method but that is for another tutorial.
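
Because Gaia reports parallax rather than distance, a distance window like the one above corresponds to a parallax window. The following is a minimal sketch of that conversion, assuming the simple inverse relation d [pc] = 1000 / parallax [mas]. The function name is hypothetical and is not part of clusterpluck, and whether the package applies its filter in exactly this way is an assumption.

```python
def distance_range_to_parallax(d_min_pc, d_max_pc):
    """Convert a distance window in parsecs to a parallax window in mas.

    Hypothetical helper for illustration; assumes d = 1000 / parallax.
    """
    if d_min_pc <= 0 or d_max_pc <= d_min_pc:
        raise ValueError("expected 0 < d_min_pc < d_max_pc")
    # Nearer stars have larger parallaxes, so the bounds swap order.
    plx_min_mas = 1000.0 / d_max_pc
    plx_max_mas = 1000.0 / d_min_pc
    return plx_min_mas, plx_max_mas


# Distance range reported by the M47 search above.
plx_min, plx_max = distance_range_to_parallax(242, 725)
print(f"Parallax window: {plx_min:.2f} mas to {plx_max:.2f} mas")
```

For the M47 search this gives a window of roughly 1.38 mas to 4.13 mas.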

Now let’s use the load() method to load the data into a pandas DataFrame.

[3]:
t = cp.load()

We can check the contents of the dataframe using simple pandas commands.

[4]:
t.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 992 entries, 0 to 991
Data columns (total 20 columns):
 #   Column           Non-Null Count  Dtype
---  ------           --------------  -----
 0   name             992 non-null    object
 1   otypes           992 non-null    object
 2   parallax         992 non-null    float64
 3   parallax_error   992 non-null    float64
 4   pmra             992 non-null    float64
 5   pmra_error       992 non-null    float64
 6   pmdec            992 non-null    float64
 7   pmdec_error      992 non-null    float64
 8   bp_rp            992 non-null    float64
 9   phot_g_mean_mag  992 non-null    float64
 10  ra               992 non-null    float64
 11  dec              992 non-null    float64
 12  distance         992 non-null    float64
 13  distance_error   992 non-null    float64
 14  m_v_tycho        992 non-null    float64
 15  b_v              992 non-null    float64
 16  abs              992 non-null    float64
 17  t_k              992 non-null    float64
 18  lum_s            992 non-null    float64
 19  prob             992 non-null    float64
dtypes: float64(18), object(2)
memory usage: 155.1+ KB

As you can see, the load() method takes some time to complete. That’s because it’s doing a lot of work. The parallax is converted into a distance in parsecs, and the Gaia G magnitude and colour index are converted into a more standardized Tycho magnitude and B-V colour index. From these, the approximate stellar absolute magnitude, effective temperature and luminosity are calculated.
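
The first and last of these conversions can be sketched with two standard relations: distance as the inverse of the parallax, and absolute magnitude from the distance modulus. This is only an illustration under those assumptions; the function names are hypothetical, extinction is ignored, and load() may do more than this internally.

```python
import math


def parallax_to_distance_pc(parallax_mas):
    """Distance in parsecs from a parallax in milliarcseconds (d = 1000 / p)."""
    if parallax_mas <= 0:
        raise ValueError("simple inversion needs a positive parallax")
    return 1000.0 / parallax_mas


def absolute_magnitude(apparent_mag, distance_pc):
    """Distance modulus: M = m - 5 * log10(d / 10 pc), ignoring extinction."""
    return apparent_mag - 5.0 * math.log10(distance_pc / 10.0)


# parallax and m_v_tycho for HD 60855, the first row of this tutorial's dataframe.
d = parallax_to_distance_pc(2.077637)
m_abs = absolute_magnitude(5.641895, d)
print(f"distance = {d:.1f} pc, absolute magnitude = {m_abs:.2f}")
```

The results (about 481.3 pc and -2.77) agree with the distance and abs values listed for that star.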

Then a simple cluster probability algorithm is run to help classify stars by giving them a percentage-style ‘rating’ based on their proximity to the centroids of the position, distance and proper motion of the cluster.

Finally, a matching algorithm runs through all the objects and compares their RA and Dec to a list of objects in the SIMBAD catalogue. If these match, the name of the object and its object type are added to the dataframe, which is then saved in CSV format.
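
A positional match of this kind is usually decided by the angular separation between two sets of coordinates. The sketch below shows one way to do that with the haversine formula. The function names, the 1 arcsecond tolerance and the catalogue position are all assumptions made for illustration, not details of clusterpluck’s matching algorithm.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions (haversine)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (
        math.sin((dec2 - dec1) / 2) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2
    )
    return math.degrees(2 * math.asin(math.sqrt(min(1.0, a))))


def is_positional_match(ra1, dec1, ra2, dec2, tolerance_arcsec=1.0):
    """True if the two positions lie within the (assumed) matching tolerance."""
    return angular_separation_deg(ra1, dec1, ra2, dec2) * 3600.0 <= tolerance_arcsec


# Gaia position of HD 60855 from this tutorial's dataframe, compared with a
# made-up catalogue position offset by 0.5 arcsec in declination.
gaia_ra, gaia_dec = 114.016190, -14.492770
catalogue_ra, catalogue_dec = gaia_ra, gaia_dec + 0.5 / 3600.0
print(is_positional_match(gaia_ra, gaia_dec, catalogue_ra, catalogue_dec))
```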

The dataframe is sorted by Gaia G-band magnitude, brightest first, so the phot_g_mean_mag values increase down the table.

[5]:
t
[5]:
     name      otypes                   parallax  parallax_error       pmra  pmra_error      pmdec  pmdec_error      bp_rp  phot_g_mean_mag          ra         dec    distance  distance_error  m_v_tycho        b_v        abs           t_k        lum_s       prob
0    HD 60855  **|*|Em*|Be*|V*|IR|UV|X  2.077637        0.207615  -7.109151    0.155182   0.780030     0.133099  -0.118344         5.627999  114.016190  -14.492770  481.315959       43.727498   5.641895  -0.150834  -2.770256  11890.825947  1182.078739  79.900015
1    HD 61224  **|*|Em*|Be*|IR|UV       2.012069        0.106906  -7.138949    0.080708   1.145077     0.083712   0.081778         6.473912  114.411640  -14.440757  497.000887       25.074589   6.499310  -0.011235  -1.982476   9207.670048   572.182144  38.532447
2    HD 61017  X|**|*|IR                2.021288        0.100697  -7.310861    0.086194   0.785335     0.074871  -0.002212         6.655011  114.171852  -14.443610  494.734135       23.477293   6.673285  -0.069918  -1.798574  10152.913748   483.029546  80.101179
3    HD 61114  *|IR                     1.743823        0.067370  -7.792587    0.054587   2.608254     0.045973   2.395475         6.861200  114.285469  -14.324596  573.452773       21.330566   8.088121   1.639623  -0.704367   2807.316771   176.315618   0.522240
4    HD 60998  *|UV|**|IR               1.865338        0.091669  -7.078137    0.070441   1.084258     0.063201  -0.031709         6.872384  114.150463  -14.484610  536.095954       25.111499   6.888938  -0.090495  -1.757274  10538.810106   465.000918  69.543404
...
987                                     2.101019        2.697285  -5.451604    1.943831  -0.055180     2.318828   1.512276        20.853565  114.108059  -14.141460  475.959477      267.552516  21.434158   1.004453  13.046308   3775.135396     0.000557   0.759856
988                                     3.006464        3.076210  -5.971694    2.067749  -0.149494     2.225318   1.086000        20.883945  114.298768  -14.845272  332.616656      168.215278  21.223415   0.699234  13.613695   4545.307747     0.000330   0.329157
989                                     3.919975        4.900724  -5.411766    3.079569  -0.184064     4.386642   1.416664        20.885458  114.524786  -14.436349  255.103698      141.733991  21.407289   0.935868  14.373705   3923.316770     0.000164   0.007395
990                                     4.023713        3.740082  -6.316390    3.189567   1.229143     2.255890   1.818579        20.899426  113.957239  -14.413898  248.526653      119.723675  21.684544   1.224519  14.707679   3369.758463     0.000121   0.539806
991                                     1.840975        3.254297  -5.187855    2.186801   2.466972     2.374998   3.004002        20.947468  114.296388  -14.796842  543.190356      346.929973  22.697336   2.076619  14.022576   2390.005509     0.000227   0.045637

992 rows × 20 columns
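
Since load() returns an ordinary pandas dataframe, the usual pandas operations apply to it. As a minimal sketch, the example below builds a small frame from a few of the rows and columns shown above and keeps only the stars whose prob rating exceeds a threshold. The threshold of 50 is an arbitrary value chosen for illustration, not a recommendation from clusterpluck.

```python
import pandas as pd

# A few rows and columns copied from the output above; the real frame has 992 rows.
sample = pd.DataFrame(
    {
        "name": ["HD 60855", "HD 61224", "HD 61017", "HD 61114", "HD 60998"],
        "phot_g_mean_mag": [5.627999, 6.473912, 6.655011, 6.861200, 6.872384],
        "distance": [481.315959, 497.000887, 494.734135, 573.452773, 536.095954],
        "prob": [79.900015, 38.532447, 80.101179, 0.522240, 69.543404],
    }
)

PROB_THRESHOLD = 50.0  # arbitrary cut-off chosen for this example

# Keep the stars rated above the threshold, highest rating first.
likely_members = (
    sample[sample["prob"] > PROB_THRESHOLD]
    .sort_values("prob", ascending=False)
    .reset_index(drop=True)
)
print(likely_members[["name", "prob", "distance"]])
```

With the full dataset, the same selection can be applied to t in place of sample.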

Once the data are loaded to a variable we can check to see if we have a cluster. Using the Refine class, let’s see how the proper motion plot, pm_plot(), looks.

[6]:
Refine.pm_plot(t)
../_images/notebooks_1_NameSearch_11_0.png

We can see the cluster’s stars forming a group right in the middle of the plot. That means SIMBAD has given us good proper motion data and the default proper motion radius is acceptable. Any smaller and we would lose relevant star data; any larger and there would be more chance of unrelated stars being included.

Now let’s look at a plot of the cluster as a star map, map().

[7]:
Refine.map(t)
../_images/notebooks_1_NameSearch_13_0.png

This plot shows a star map of the search, with each star’s size scaled according to its magnitude. It can help show us whether our search radius is too wide or too narrow. This looks pretty good, as the cluster appears fully contained.

Now let’s see if the distance filter has suitable values using d_hist(). This produces a simple histogram of the stellar distances.

[8]:
Refine.d_hist(t)
../_images/notebooks_1_NameSearch_15_0.png

There is a very clear, tall peak in the middle of our graph, which tells us the cluster stars clearly outnumber the unrelated stars. We can also see roughly how far away the cluster is in parsecs just by looking. The distance filter doesn’t need refining either. That covers all the parameters involved in the search.

However, let’s have a look at two of the features we can draw from these data: a colour-magnitude diagram and a more precise measurement of the distance.

Using the Plotting class, we can call the cmd() method, which uses values of G magnitude and the Gaia bp-rp colour index…

[10]:
Plotting.cmd(t)
../_images/notebooks_1_NameSearch_17_0.png

… and we have a beautiful CMD plot with the classic main sequence of stars running from top left to bottom right. These stars are in the middle of their lives, fusing hydrogen in their cores in a relatively stable way, just like our Sun. The stars just above the main sequence are mainly multiple star systems, whose combined light gives them a slightly higher luminosity.

Another feature is the group of brightest stars at the top, which appear to be just ‘curling’ upwards. This is called the main sequence turn-off. The stars here are running low on core hydrogen and starting to evolve into red giants. They’re not quite at that point yet, but the position of the turn-off is one of the main methods of estimating a cluster’s age. At the other end are the red and white dwarfs.

Finally, use the dist() method of the Info class to extract a distance calculated from the parallax data, together with an uncertainty range, which the output reports as 5% and 95% bounds.

[11]:
Info.dist(t)
Distance: 474 pc
5%: 341 pc - 95%: 632
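
How Info.dist() derives these figures is not described here. One common way to summarise a sample of distances in this style is with the median and the 5th and 95th percentiles, sketched below with NumPy. Treat this as an assumption about the general approach rather than the package’s actual method. The input is just the five brightest stars from the dataframe above, so the numbers will not match the output of Info.dist(t).

```python
import numpy as np

# Distances (pc) of the five brightest stars in the dataframe above,
# used only to keep the example small.
distances_pc = np.array([481.315959, 497.000887, 494.734135, 573.452773, 536.095954])

# Median as the central estimate, 5th and 95th percentiles as the range.
median_pc = np.median(distances_pc)
low_pc, high_pc = np.percentile(distances_pc, [5, 95])

print(f"Distance: {median_pc:.0f} pc")
print(f"5%: {low_pc:.0f} pc - 95%: {high_pc:.0f} pc")
```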

© Copyright 2022, Marc Canale.
