A Brief Introduction to Python

This page is intended to serve as an outline for the python REU discussion on Thursday, June 9, 2016, and as a useful reference for folks trying to learn python.


Resources

There are lots of resources out there to help one learn python.

For most of the code I write, I don't go digging through tutorials to remember snippets of how to do things. Instead, I use google, old code, and some good reference websites, as detailed below:

Many (older) astronomers are used to using idl. There are lots of good resources for idl users who want to switch to python:


Some example code

Michael will talk through some example code, which can be downloaded here (if you prefer an ipython notebook) or here (if you'd prefer just a normal python script). You can download the data used in the example here. Note that there is also a nice html page with the same information, along with a pdf in either color or large print.

For those very new to python, one of the most important things for understanding python syntax is that python is an object-oriented programming language. This means that python is built around "objects," such as a list of numbers:

>>> mylist = [1,2,3,4]
mylist is a list object, which has certain properties and can be manipulated in certain ways. These manipulations, called methods, belong to the list object and are accessed by calling object.method(). So, for example, if you wanted to add the value 5 to the list, you would do:
>>> mylist.append(5)
>>> mylist
[1, 2, 3, 4, 5]
In addition to methods of objects, you can also have functions that operate on objects. For example, if I want to know the length of mylist I can use the len function:
>>> len(mylist)
5
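The difference between a function and a method can be seen side by side in the short sketch below. The variable names are made up for illustration. Note that list methods such as append() and sort() change the list in place and return None, while functions such as len() and sorted() leave the list alone and hand back a new value.

```python
mylist = [3, 1, 2]

# A function takes the object as an argument and returns a result.
ordered = sorted(mylist)   # a new, sorted list; mylist itself is unchanged
n_items = len(mylist)      # number of items in the list

# A method is called on the object itself. list.append() and list.sort()
# modify the list in place and return None.
result = mylist.append(0)
mylist.sort()

print(ordered)
print(mylist)
print(result)
```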

With this in mind, have a go at Michael's example and the example exercises below. Good luck!


Demonstration exercises

Here are some sample exercises to work through. They demonstrate many techniques that we use all the time.

Beginner Level

This exercise is designed for those who are fairly new to python and coding in general. It asks you to read in a list of numbers from a file and to write an algorithm to sort the list.

  1. Using the techniques described in the example code above, read in this file and store its contents as a list (f.readlines() will be useful). Print your list.
  2. The list you've read in will be a list of strings. Write a for loop that converts each string in the list to an integer (using range(len(list))...). Print your updated list.
  3. Next, create a second, empty list to store the sorted data in.
  4. Now write a for loop that loops over the list you read in from file and keeps track of the lowest value it has seen so far.
  5. Congratulations, you've now found the lowest value in the list. Take the value stored in your for loop and add it to your second list (using the list.append() method). Use the list.remove(x) method to remove the value you've just added to the second list from the first list.
  6. Now repeat the process in steps 4 and 5 for each value in the initial list (do this by embedding steps 4 and 5 in a for loop; the syntax range(len(list)) will be useful here). [Note, you also could use a while statement, but we'll stick with for loops].
  7. Print out your newly sorted list to make sure your algorithm worked.
  8. If time permits, add a variable verbose that, when true, makes your code print out the list at each step of the way.
  9. If time permits, come up with a more efficient method for sorting the list (there are many: it's fine to use google to see what sorting algorithms are out there. And of course, there's a python sort command - see if you can figure out how it works).
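As a rough guide to the shape of the exercise, the steps above can be sketched as follows. This is only one possible approach, not the reference solution. It assumes the file holds one integer per line, and it uses an in-memory io.StringIO object as a stand-in for the real file, since the file name is not given here. The function names read_integers and selection_sort are made up for this sketch.

```python
import io

def read_integers(f):
    """Read one integer per line from an open file object."""
    lines = f.readlines()               # list of strings such as "7\n"
    numbers = []
    for i in range(len(lines)):
        numbers.append(int(lines[i]))   # int() ignores surrounding whitespace
    return numbers

def selection_sort(unsorted, verbose=False):
    """Return a sorted copy by repeatedly removing the lowest value."""
    remaining = list(unsorted)          # copy, so the caller's list is untouched
    sorted_list = []
    for _ in range(len(remaining)):
        # Inner loop: find the lowest value still in the list.
        lowest = remaining[0]
        for value in remaining:
            if value < lowest:
                lowest = value
        sorted_list.append(lowest)
        remaining.remove(lowest)        # removes the first matching value
        if verbose:
            print(remaining, sorted_list)
    return sorted_list

# Stand-in for an opened data file (assumed format: one integer per line).
fake_file = io.StringIO("7\n3\n9\n3\n1\n")
numbers = read_integers(fake_file)
print(selection_sort(numbers))
```

Passing verbose=True prints both lists after every pass, which makes it easy to watch values move from one list to the other.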

Once you have finished the exercise you can compare your answers to my code here.

Intermediate Level

This exercise is designed for those who are already somewhat comfortable with python and want to learn more about exploiting its capabilities. It asks you to read in a file containing 10 time series, each containing a gaussian radio pulse. Then, using numpy and matplotlib, it asks you to plot the pulse, measure the pulse's signal to noise ratio, and output values in a nicely formatted table.

  1. Read in this file.
  2. The file contains 10 rows of comma separated numbers. Each row represents the amount of signal output from a radio antenna as a function of time (in 1 second time intervals). Loop through the lines in the file (f.readlines() will be useful here). For each line, do the following:
    1. Convert the line from one long string into a numpy array of floats.
    2. Using matplotlib.pyplot make a plot of the data you just read in as a function of time (hint: you'll have to figure out how many time steps are present in the data).
    3. Using the capabilities of numpy, find the value of the maximum flux in your time series.
    4. Excluding your pulse, calculate the rms noise in your time series (the pulse is in the first half of the time series, so you can cheat and just limit yourself to the second half). Recall that the rms is the root mean square: find the mean of the squares of all the points, then take the square root. You might also use np.std() and compare the results (and think about why they are different, if they are different).
    5. Do a simple estimate of the signal to noise ratio of the pulse as peakflux/rms.
    6. Using a formatted string, print the output signal to noise, peakflux and rms to a descriptive table, rounding each number to two decimal places.
  3. If time permits, figure out how to display all your time series on top of one another at the end, rather than having the plots pop up one at a time.
  4. If time permits, mess around with fitting the gaussian pulse and come up with other estimates of the signal to noise ratio.
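The arithmetic in the inner steps (parse a line, find the peak, measure the rms on the second half, form peak/rms, print a table) can be sketched as below. To keep the sketch runnable anywhere, it uses only the standard library as a stand-in for the numpy calls the exercise asks for; with numpy, the rms would be np.sqrt(np.mean(noise**2)) and the peak np.max(series). The input row is made up, and the names parse_line and pulse_stats are hypothetical. Plotting is left out.

```python
import math
import statistics

def parse_line(line):
    """Convert one comma-separated line into a list of floats."""
    return [float(x) for x in line.strip().split(",")]

def pulse_stats(series):
    """Return (snr, peak, rms), with the rms measured on the second half."""
    peak = max(series)
    noise = series[len(series) // 2:]    # assumed pulse-free half
    rms = math.sqrt(sum(x * x for x in noise) / len(noise))
    return peak / rms, peak, rms

# Made-up row standing in for one line of the data file.
line = "0.0,1.0,5.0,1.0,0.0,1.0,-1.0,1.0,-1.0,1.0\n"
series = parse_line(line)
snr, peak, rms = pulse_stats(series)

# Formatted table, two decimal places per value.
print("{:>8} {:>8} {:>8}".format("S/N", "peak", "rms"))
print("{:8.2f} {:8.2f} {:8.2f}".format(snr, peak, rms))

# The rms and the standard deviation differ when the noise mean is not zero:
# rms**2 == std**2 + mean**2, where std is the population value
# (the same convention np.std() uses by default).
noise = series[len(series) // 2:]
std = statistics.pstdev(noise)
mean = statistics.mean(noise)
print(rms ** 2, std ** 2 + mean ** 2)
```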

Once you have finished the exercise you can compare your answers to my code here.

Advanced Level

This exercise is for those who really know what they are doing in python and would like a challenge. Given a low signal to noise pulse series you're asked to determine the pulse frequency, and then pull the pulse out of the noise by smoothing the time series and folding the data.

  1. Read in the data in this file using whatever method you'd like.
  2. Using the capabilities of scipy, take the fourier transform of the data and plot to determine the pulse frequency.
  3. Smooth the spectrum using 3-channel Hanning smoothing.
  4. Fold the data to the period you determined earlier, and plot.
  5. Iterate as necessary.
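The smoothing and folding steps can be illustrated with the minimal pure-python sketch below; it is a sketch under stated assumptions, not the reference solution. It assumes the period is a whole number of samples, and it leaves the two end points of the smoothed series unchanged, which is only one of several edge conventions. The function names are made up. In practice the same smoothing can be done with np.convolve(data, [0.25, 0.5, 0.25], mode="same"), and the Fourier transform step (not shown here) with scipy.fft or np.fft.

```python
def hanning_smooth(data):
    """3-channel Hanning smoothing with weights 0.25, 0.5, 0.25.

    The first and last points are left unchanged (an assumed edge convention).
    """
    smoothed = list(data)
    for i in range(1, len(data) - 1):
        smoothed[i] = 0.25 * data[i - 1] + 0.5 * data[i] + 0.25 * data[i + 1]
    return smoothed

def fold(data, period):
    """Average all samples that share the same phase.

    Assumes the period is an integer number of samples.
    """
    sums = [0.0] * period
    counts = [0] * period
    for i, value in enumerate(data):
        sums[i % period] += value
        counts[i % period] += 1
    # Phase bins that received no samples are reported as 0.0.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]

# Made-up data: a pulse that repeats every 4 samples.
data = [0, 0, 4, 0] * 3
print(fold(data, 4))
print(hanning_smooth([0, 0, 4, 0, 0]))
```

Folding averages the noise down while the pulse, which always lands in the same phase bin, adds up coherently.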

As an alternative exercise, use aplpy ("apple pie") to make an image of a field of ALFALFA data.

  1. Download this fits file, and plot it in inverted greyscale.
  2. Overplot a contour at 0.13 mJy/beam.
  3. There are two groups of galaxies in the image. Put a box around each one.
  4. Label the lower left group the NGC 3227 group, and the upper right group the NGC 3190 group.
  5. Make your axis labels bold, and give the figure a thick border.
  6. Save a .png and .eps version of the figure.

Compare to my code to make a figure from one of my recent papers!


This page was created by Luke for anyone to reference, specifically those in the Cornell astronomy community. Please contact him with questions or comments.
Last updated by Luke on June 9, 2016 at 1:38pm.