A Brief Introduction to Python
This page is intended to serve as an outline for the python REU discussion on
Thursday, June 9, 2016, and as a useful reference for folks trying to learn python.
Resources
There are lots of resources out there to help one learn python.
For most of the code I write, I don't go digging through tutorials to remember snippets of how
to do things. Instead, I use google, old code, and some good reference websites as detailed below:
Many (older) astronomers are used to using idl. There are lots of good resources for idl users
who want to switch to python:
Some example code
Michael will talk through some example code, which can be downloaded here (if you prefer an ipython notebook) or here (if you'd prefer just a normal python script). You can download the data used in the example here. Note that there is also a nice html page with the same information, along with either a color or a large-print pdf.
For those very new to python, one of the most important things for
understanding python syntax is that python is an object-oriented programming
language. This means that python is built around "objects," such as a list of
numbers:
>>> mylist = [1,2,3,4]
mylist is a list object, which has certain properties and can be manipulated in certain ways.
These manipulations, called methods, are properties of the list object, and thus are
accessed by calling object.method(). So, for example, if you want to add the value 5 to the list, you
do:
>>> mylist.append(5)
>>> mylist
[1, 2, 3, 4, 5]
In addition to methods of objects, you can also have functions that operate on objects. For example, if I want to know the length of mylist I can use the len
function:
>>> len(mylist)
5
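To make the distinction concrete, here is a minimal sketch contrasting methods, which are called on the object, with built-in functions, which take the object as an argument. The names n and total are only for illustration.

```python
mylist = [1, 2, 3, 4]

# Methods belong to the object and are called as object.method().
mylist.append(5)     # modifies the list in place and returns None
mylist.reverse()     # also in place: mylist is now [5, 4, 3, 2, 1]

# Functions take the object as an argument and leave it unchanged.
n = len(mylist)      # number of entries
total = sum(mylist)  # sum of the entries

print(mylist, n, total)
```

Because append() and reverse() change the list in place and return None, you look at mylist itself afterwards rather than at the return value.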
With this in mind, have a go at Michael's example and the example exercises below. Good luck!
Demonstration exercises
Here are some sample exercises to work through. They demonstrate many techniques that
we use all the time.
Beginner Level
This exercise is designed for those who are fairly new to python and coding
in general. It asks you to read in a list of numbers from a file and to write an
algorithm to sort the list.
- Using the techniques described in the example code above, read in this file
and store its contents as a list (f.readlines() will be useful). Print your list.
- The list you've read in will be a list of strings. Write a for loop that converts
each string in the list to an integer (using range(len(list))...). Print your updated list.
- Next, create a second, empty list to store the sorted data in.
- Now write a for loop that loops over the list you read in from file and:
- stores the first entry
- looks at each successive entry in the list and compares it to the
stored entry.
- If an entry is less than the stored entry, replace the stored entry
with this new lowest value.
- Congratulations, you've now found the lowest value in the list. Take the value stored in your for loop and add it to your second list (using the list.append() method). Use the list.remove(x) method to remove the value you've just added to the second list from the first list.
- Now repeat the process in steps 4 and 5 (the two previous bullets) for each value in the initial list (do this by embedding steps 4 and 5 in a for loop; the syntax range(len(list)) will be useful here). [Note: you could also use a while statement, but we'll stick with for loops.]
- Print out your newly sorted list to make sure your algorithm worked.
- If time permits, add a variable verbose that, when true, prints out the list at each
step of the way.
- If time permits, come up with a more efficient method for sorting the list
(there are many: it's fine to use google to see what sorting algorithms are out
there. And of course, there's a python sort command - see if you can figure out how it works).
Once you have finished the exercise you can compare your answers
to my code here.
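If you get stuck, the steps above can be sketched roughly as follows. This is one possible approach, not necessarily the one in the linked solution. The file name numbers.txt and its contents are made up here so that the sketch runs on its own; swap in the real data file.

```python
import os
import tempfile

# Hypothetical stand-in for the exercise's data file: one integer per line.
path = os.path.join(tempfile.mkdtemp(), "numbers.txt")
with open(path, "w") as f:
    f.write("7\n3\n9\n1\n4\n")

# Read the file; readlines() returns a list of strings such as "7\n".
with open(path) as f:
    unsorted = f.readlines()

# Convert each string to an integer in place (int() ignores the newline).
for i in range(len(unsorted)):
    unsorted[i] = int(unsorted[i])

verbose = False   # set to True to print both lists after every pass
sorted_list = []

# Each pass finds the lowest remaining value and moves it to sorted_list.
for _ in range(len(unsorted)):
    lowest = unsorted[0]
    for value in unsorted:
        if value < lowest:
            lowest = value
    sorted_list.append(lowest)
    unsorted.remove(lowest)   # removes the first occurrence of lowest
    if verbose:
        print(sorted_list, unsorted)

print(sorted_list)
```

Note that range(len(unsorted)) is evaluated once, before the loop starts, so the outer loop still runs once per original entry even though the list shrinks on every pass.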
Intermediate Level
This exercise is designed for those who are already somewhat comfortable with
python and want to learn more about exploiting its capabilities. It asks you to
read in a file containing 10 time series, each containing a gaussian radio pulse.
Then, using numpy and matplotlib, it asks you to plot the pulse, measure the pulse's signal to noise ratio, and output values in a nicely formatted table.
- Read in this file.
- The file contains 10 rows of comma separated numbers. Each row represents the amount
of signal output from a radio antenna as a function of time (in 1 second time intervals).
Loop through the lines in the file (f.readlines() will be useful here). For each line, do
the following:
- Convert the line from one long string into a numpy array of floats.
- Using matplotlib.pyplot make a plot of the data you just read in
as a function of time (hint: you'll have to figure out how many
time steps are present in the data).
- Using the capabilities of numpy, find the value of the maximum flux in your
time series.
- Excluding your pulse (the pulse is in the first half of the time series,
so you can cheat and just limit yourself to the second half of the time series),
calculate the rms noise in your time series. (Recall that the rms is the root mean
square: find the mean of the squares of all the points, then take the square root.
You might also use np.std() and compare the results, and think about why they are
different, if they are different.)
- Do a simple estimate of the signal to noise ratio of the pulse as peakflux/rms.
- Using a formatted string, print the output signal to noise, peakflux and rms to
a descriptive table, rounding each number to two decimal places.
- If time permits, figure out how to display all your time series on top of one another
at the end, rather than having the plots pop up one at a time.
- If time permits, mess around with fitting the gaussian pulse and come up with other estimates
of the signal to noise ratio.
Once you have finished the exercise you can compare your answers
to my code here.
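As a rough guide to the numerical steps above, here is a minimal sketch for a single row. The row itself is synthetic (a gaussian pulse plus random noise, with made-up amplitude and width) so that the sketch runs without the data file, and the plotting step is left out. The variable names are only illustrative.

```python
import numpy as np

# Hypothetical stand-in for one row of the file: a gaussian pulse in the
# first half of the series plus noise, written as comma separated text.
rng = np.random.default_rng(0)
t = np.arange(200)                       # time steps, 1 second apart
series = 5.0 * np.exp(-0.5 * ((t - 50) / 4.0) ** 2) + rng.normal(0.0, 0.5, t.size)
line = ",".join(f"{v:.4f}" for v in series) + "\n"

# Convert the line from one long string into a numpy array of floats.
flux = np.array([float(x) for x in line.split(",")])

peak = flux.max()                        # maximum flux in the time series
noise = flux[flux.size // 2:]            # second half, away from the pulse
rms = np.sqrt(np.mean(noise ** 2))       # root mean square
std = np.std(noise)                      # measures scatter about the mean instead
snr = peak / rms                         # simple signal to noise estimate

# Formatted table, each number rounded to two decimal places.
print(f"{'S/N':>8} {'peak':>8} {'rms':>8}")
print(f"{snr:8.2f} {peak:8.2f} {rms:8.2f}")
```

The rms and np.std() values agree only when the mean of the noise is zero, since rms squared equals std squared plus the mean squared.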
Advanced Level
This exercise is for those who really know what they are doing in python and would
like a challenge. Given a low signal to noise pulse series you're asked to determine the pulse frequency, and then pull the pulse out of the noise by smoothing the time series and folding the data.
- Read in the data in this file using whatever method you'd like.
- Using the capabilities of scipy, take the fourier transform of the data and plot to
determine the pulse frequency.
- Smooth the time series using 3 channel hanning smoothing.
- Fold the data to the period you determined earlier, and plot.
- Iterate as necessary.
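The overall flow of these steps might look something like the sketch below. Several things here are assumptions: the data are synthetic (a weak gaussian pulse every 25 samples, 1 second sampling), the smoothing is applied to the time series with weights 0.25, 0.5, 0.25, and numpy.fft is used in place of scipy to keep dependencies small (scipy.fft provides rfft and rfftfreq with the same call pattern). Plotting is omitted.

```python
import numpy as np

# Hypothetical stand-in for the data file: weak periodic pulses in noise.
rng = np.random.default_rng(1)
dt = 1.0                                  # assumed sample spacing in seconds
true_period = 25                          # samples; the sketch should recover this
n = 4000
t = np.arange(n)
pulses = np.exp(-0.5 * ((t % true_period - 12) / 4.0) ** 2)
data = pulses + rng.normal(0.0, 1.0, n)

# Fourier transform; subtract the mean so the zero-frequency bin does not dominate.
spectrum = np.abs(np.fft.rfft(data - data.mean()))
freqs = np.fft.rfftfreq(n, d=dt)
peak_freq = freqs[1:][np.argmax(spectrum[1:])]
period_samples = int(round(1.0 / (peak_freq * dt)))

# 3 channel hanning smoothing: weighted average of each point and its neighbours.
kernel = np.array([0.25, 0.5, 0.25])
smoothed = np.convolve(data, kernel, mode="same")

# Fold: cut the series into whole periods and average them together.
n_cycles = smoothed.size // period_samples
folded = smoothed[: n_cycles * period_samples].reshape(n_cycles, period_samples).mean(axis=0)

print(f"period: {period_samples * dt:.1f} s, peak phase bin: {np.argmax(folded)}")
```

Folding works because the pulse adds up at the same phase in every cycle while the noise averages down. With real data the period will rarely be a whole number of samples, which is where the iteration comes in.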
As an alternative exercise, use aplpy ("apple pie") to make an image of a field
of ALFALFA data.
- Download this fits file, and plot it in inverted greyscale.
- Overplot a contour at 0.13 mJy/beam.
- There are two groups of galaxies in the image. Put a box around each one.
- Label the lower left group the NGC 3227 group, and the upper right group the NGC 3190 group.
- Make your axis labels bold, and give the figure a thick border.
- Save a .png and .eps version of the figure.
Compare to my code to make a figure from one of my recent papers!
This page was created by Luke for anyone to reference, specifically those in the
Cornell astronomy community. Please contact him with questions or comments.
Last updated by Luke on June 9, 2016 at 1:38pm.