Python basics - Getting help
- There are many places to turn to for a Python reference and basic help. The quickest way to get help on a function is to google "python" followed by what you're looking for. Typically, Google will refer you to http://docs.python.org/. For example, try googling "python randomize". Google is good. Below are some additional references:
- 1. Search http://stackoverflow.com
- For example, try searching for "python randomize" on Stack Overflow.
- 2. http://software-carpentry.org/4_0/python/ Software Carpentry also has lectures and tutorials on Linux, Scientific Computing, and many other topics.
- 3. If you're having trouble conceptualizing how Python executes some bit of code, the Python visualizer can help.
- 4. Check out this page for a python crash course taught at UC Berkeley by my friend Lenny Teytelman
- 5. If you're a Matlab user transitioning to Python, this is the page for you: NumPy for Matlab users
- 6. There are some handy references in the Psych711_commons\programmingreferences folder in Box.
- Lists and list comprehension: http://docs.python.org/tutorial/datastructures.html#more-on-lists
- Useful functions for Python dictionaries: http://docs.python.org/release/2.5.2/lib/typesmapping.html
- Writing/reading files: http://docs.python.org/tutorial/inputoutput.html#reading-and-writing-files
- Sorting lists and dictionaries - nice tips for how to sort by a particular key: http://wiki.python.org/moin/HowTo/Sorting
Python mini tutorials and tips
Get help on a module
- To get help on the functions contained in some module, for instance, the module 'string', type
import string
help(string)
Oo, look at that, learn something every time:
import string
string.letters
>>> 'a' in string.letters
True
>>> 'D' in string.letters
True
>>> '3' in string.letters
False
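A minimal sketch of inspecting a module without leaving the interpreter. It is written for Python 3, where `string.letters` no longer exists and `string.ascii_letters` is the closest equivalent; `public_names` is just an illustrative variable name.

```python
import string

# dir() lists the names a module defines; filter out the private ones
public_names = [name for name in dir(string) if not name.startswith('_')]
print(public_names)

# Python 3 replacement for string.letters
print('a' in string.ascii_letters)   # True
print('3' in string.ascii_letters)   # False

# help(string) prints the full module documentation;
# help(string.capwords) prints the documentation for a single function
```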
Notes on importing libraries and functions
Python provides a somewhat confusing variety of ways of importing functions and libraries.
import X
import X as Y
from X import *
from X import a,b,c
X = __import__('X')
The differences and pros and cons are discussed in this excellent article
- To find out the version of the library you've imported:
import nltk
nltk.__version__
- To find out the location of the source files that are being loaded when you import a library:
import nltk
nltk.__file__
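The import forms listed above can be tried out on a standard-library module. This is a small sketch using `math` and `os` as stand-ins (nltk may not be installed), and it uses `importlib.import_module` as the more readable relative of `__import__`.

```python
import math                      # names are reached as math.sqrt
import math as m                 # same module under a shorter name
from math import sqrt            # pull one name into the current namespace
import importlib
import os

mod = importlib.import_module('math')   # import by a name held in a string

print(math.sqrt(16))             # 4.0
print(sqrt(16))                  # 4.0
print(m is math and mod is math) # True: all three names refer to one module object

# __file__ shows where a pure-Python module was loaded from
print(os.__file__)
```

Note that `from X import *` is left out on purpose: it makes it hard to tell where a name came from.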
Finding something in lists and strings
Suppose you have a list called shoppingList:
shoppingList = ['apples', 'oranges', 'screwdriver']
And you want to determine if this list contains some item, say, 'apples'. The easiest way to do it is to use 'in':
if 'apples' in shoppingList:
    print 'yep'
Now, suppose your shopping list is stored in a string called shoppingString and you want to determine whether that string contains the word 'apples'.
shoppingString = 'apples, oranges, screwdriver'
Turns out 'in' works here as well:
if 'apples' in shoppingString:
    print 'yep'
The reason the 'in' operator works here is that 'in' is defined for all sequences (lists, tuples, strings, etc.). Note, however, that in this case there is an ambiguity. In the shoppingList list, 'apples' is a standalone element. In the shoppingString string, Python doesn't know where one element starts and the next stops; it simply looks for a substring. Therefore, both of these statements will be true for shoppingString:
'apple' in shoppingString #true
'apples' in shoppingString #true
but not for shoppingList:
'apple' in shoppingList #false
'apples' in shoppingList #true
Just as you can use 'in' to check if an element is contained in a sequence, you can use 'not in' to check if it's not in the sequence.
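One way to remove the substring ambiguity described above is to split the string back into items before testing membership. A minimal sketch, assuming the items are separated by commas (the variable name `items` is illustrative):

```python
shoppingString = 'apples, oranges, screwdriver'

# Split on commas and strip whitespace to recover the individual items
items = [item.strip() for item in shoppingString.split(',')]

print('apple' in shoppingString)   # True: substring match
print('apple' in items)            # False: must equal a whole item
print('apples' in items)           # True
print('pears' not in items)        # True
```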
See the Python doc on exceptions here http://docs.python.org/tutorial/errors.html
The 'pythonic' way of doing things is to try it and catch the exception rather than check first.
For example, rather than doing this:
import os
if os.path.exists('name.txt'):
    f = open('name.txt','r')
else:
    print "file does not exist"
do this:
try:
    f = open('name.txt','r')
except IOError:
    print "erm, file not found"
There are many cases where you have to use exceptions to keep your program from crashing, for example, division by 0.
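The division-by-zero case mentioned above might be handled like this. `safe_ratio` is a hypothetical helper written for this sketch, not a library function, and returning `None` is just one possible policy.

```python
def safe_ratio(numerator, denominator):
    """Return numerator/denominator, or None when the denominator is 0."""
    try:
        return float(numerator) / denominator
    except ZeroDivisionError:
        # Only the error we expect is caught; anything else still propagates
        return None

print(safe_ratio(3, 2))   # 1.5
print(safe_ratio(3, 0))   # None
```

Catching the specific exception (rather than a bare `except:`) keeps unrelated bugs from being silently swallowed.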
Using list comprehension
[letter for letter in 'abracadabra']
is better than this
for letter in 'abracadabra':
    print letter
Here's another example. Say you have a list of names and you want to split them into first and last names
names = ['Ed Sullivan', 'Salvador Dali']
firstNames = [name.split(' ')[0] for name in names]
lastNames = [name.split(' ')[1] for name in names]
Another example: generate 100 random numbers in the range 1-5:
[random.randint(1,5) for i in range(100)]
Or generate 100 random letters:
[random.choice(list(string.ascii_lowercase)) for i in range(100)]
And yet another example, this one restricting the output using a conditional. Generate numbers from 0-7, but omitting 2 and 5:
[location for location in range(8) if location not in [2,5]]
def repetition(letters,numberBeforeSwitch,numRepetitions):
    print '\n'.join([item for sublist in [[i] * numberBeforeSwitch for i in letters] for item in sublist] * numRepetitions)
It is fast and compact, but certainly not very clear.
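For comparison, here is a sketch of the same logic unrolled into explicit loops. `repetition_lines` is a hypothetical name, and unlike the one-liner it returns the list instead of printing it, which makes it easier to check.

```python
def repetition_lines(letters, numberBeforeSwitch, numRepetitions):
    """Return each letter repeated numberBeforeSwitch times, the whole pass repeated numRepetitions times."""
    onePass = []
    for letter in letters:
        # repeat each letter numberBeforeSwitch times before moving on
        onePass.extend([letter] * numberBeforeSwitch)
    # repeat the whole pass numRepetitions times
    return onePass * numRepetitions

print('\n'.join(repetition_lines('ab', 2, 2)))
```

With `'ab', 2, 2` this yields a, a, b, b, a, a, b, b, one per line.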
How to flatten a list
Say you've got a list like this:
But what you want is this:
You can turn list1 into list2 (i.e., flatten list1) like so:
list2 = [item for sublist in list1 for item in sublist]
The above method only works for flattening lists of depth 1; see here for more information.
An alternative way of flattening a list is to use NumPy. Assuming we have a variable called list1,
import numpy
list1 = numpy.array(list1) #convert it to a numpy array
list1 = list1.flatten() #flatten it
list1 = list(list1) #convert it back to a Python list, if you want.
#we can, of course, do it all in one line:
list1 = list(numpy.array(list1).flatten())
(In cases like this, you can continue to work with the NumPy Array, which lets you do all sorts of neat things).
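A small worked example of the comprehension-based flattening. The list values here are hypothetical, chosen only for illustration; `itertools.chain.from_iterable` is shown as a standard-library alternative that needs no NumPy.

```python
from itertools import chain

list1 = [[1, 2], [3, 4], [5]]     # hypothetical nested list

list2 = [item for sublist in list1 for item in sublist]
print(list2)                      # [1, 2, 3, 4, 5]

# itertools.chain.from_iterable does the same one-level flattening
print(list(chain.from_iterable(list1)))   # [1, 2, 3, 4, 5]

# Only one level is removed: deeper nesting survives
deeper = [[1, [2, 3]], [4]]
print([item for sublist in deeper for item in sublist])   # [1, [2, 3], 4]
```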
- Don't reinvent the wheel. Operations like computing intersections, unions, and uniqueness are all well-defined functions in set notation and are built in to Python. See here. Some examples of sets:
Get the intersection (the elements in common)
>>> set('abc').intersection('cde')
set(['c'])
Get the union (all the elements)
>>> set('abc').union('cdef')
set(['a', 'c', 'b', 'e', 'd', 'f'])
- Note that because, by definition, a set can only contain unique elements, sets are a good way to get all the distinct elements in a list.
>>> spam = ['s','s','s','p','p','a','m']
>>> set(spam)
set(['a', 'p', 's', 'm'])
Caveat: sets are, by definition, not ordered, hence we are not guaranteed to get 's','p','a','m'.
Let's see what spam and ham have in common
>>> set('spam').intersection('ham')
set(['a', 'm'])
And what they don't
>>> set('spam').difference('ham')
set(['p', 's'])
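Because of the ordering caveat above, getting the distinct elements in their original order takes a little more work. One common sketch keeps a set only for the "have I seen this?" test and a list for the result (`seen` and `distinct` are illustrative names):

```python
spam = ['s', 's', 's', 'p', 'p', 'a', 'm']

seen = set()
distinct = []
for letter in spam:
    if letter not in seen:      # set membership tests are fast
        seen.add(letter)
        distinct.append(letter)

print(distinct)   # ['s', 'p', 'a', 'm']

# The set methods also have operator forms when both sides are sets
print(sorted(set('spam') & set('ham')))   # ['a', 'm']
print(sorted(set('spam') - set('ham')))   # ['p', 's']
```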
Arithmetic and floating point notation
- Python uses dynamic typing. This means that it attempts to automatically detect the type of variable you are creating.
spam = "can be fried"
assigns the string "can be fried" to the variable spam. Python knows it's a string because it's in quotation marks.
spam = 3
assigns the integer 3 to spam, which is not the same as
spam2 = '3'
>>> spam==spam2
False
Because a bare numeral like 3 defaults to an integer, you can get unexpected behavior (this is Python 2 behavior; in Python 3, / always performs true division):
>>> spam/2
1
This can be remedied by making the variable a floating point number,
>>> spam=3.0
>>> spam/2
1.5
or by converting it explicitly:
>>> spam=3
>>> float(spam)/2
1.5
If you're not sure what type something is, use the type() function to check.
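A short sketch of checking and converting types, using the spam and spam2 variables from above. It sticks to expressions that give the same result in Python 2 and Python 3.

```python
spam = 3
spam2 = '3'

print(type(spam))              # the integer type
print(type(spam2))             # the string type
print(isinstance(spam, int))   # True

# Converting one side makes the two comparable
print(spam == int(spam2))      # True

# Explicit float conversion gives true division in both Python versions
print(float(spam) / 2)         # 1.5

# Floor division is spelled // and gives 1 in both versions
print(spam // 2)               # 1
```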
Reference, mutability, and copying
- Have a look at this:
>>> egg = 'green'
>>> ham = egg
>>> ham
'green'
>>> egg = 'yellow'
>>> ham
'green'
Easy enough. Now have a look here:
>>> egg = ['green']
>>> ham=egg
>>> ham
['green']
>>> egg[0] = 'yellow'
>>> ham
['yellow']
What do you think is happening here? That's right, ham points to the same list as egg, not to a copy of its contents. When you change the content within egg, you've changed ham.
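If you want ham to be independent of egg, you have to copy the list explicitly. The sketch below contrasts a plain assignment, a shallow copy, and a deep copy from the standard `copy` module; the nested list and the names `alias`, `shallow`, and `deep` are made up for illustration.

```python
import copy

egg = ['green', ['runny', 'yolk']]

alias = egg                  # same list object, no copy made
shallow = list(egg)          # new outer list, inner list still shared
deep = copy.deepcopy(egg)    # fully independent copy

egg[0] = 'yellow'            # change the outer list
egg[1][0] = 'firm'           # change the nested list

print(alias)     # ['yellow', ['firm', 'yolk']]
print(shallow)   # ['green', ['firm', 'yolk']]
print(deep)      # ['green', ['runny', 'yolk']]
```

A shallow copy is enough for flat lists; nested structures need `copy.deepcopy`.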
Writing to a file, safely
import os
fileHandle = open('dataFile.txt','w')
fileHandle.write(line) #the line you are writing. Use a tab- or comma-separated string if you're writing a TSV (or CSV) file.
fileHandle.flush() #mostly unnecessary
os.fsync(fileHandle.fileno()) #ditto; it helps if you have several processes writing to the file
At the end of your experiment:
fileHandle.close()
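An alternative sketch uses a `with` block, which closes the file automatically even if the program raises an exception partway through. The data rows and the temporary directory below are made up so the example has no side effects on your real files.

```python
import os
import tempfile

# Hypothetical location: a throwaway directory
dataPath = os.path.join(tempfile.mkdtemp(), 'dataFile.txt')

rows = [['subj01', 'trial1', '523'], ['subj01', 'trial2', '498']]  # made-up data

with open(dataPath, 'w') as fileHandle:
    for row in rows:
        fileHandle.write('\t'.join(row) + '\n')   # one tab-separated line per row
        fileHandle.flush()                        # push Python's buffer to the OS
        os.fsync(fileHandle.fileno())             # ask the OS to write it to disk
# the file is closed here, even if an exception occurred inside the block

with open(dataPath) as fileHandle:
    print(fileHandle.read())
```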
Copy a file
To copy a file use shutil.copyfile(src, dst). src is the path and name of the original file. dst is the path and name where src will be copied.
import shutil
shutil.copyfile(src,dst)
This copies 1.dat into a new file named 3.dat.
This copies 1.dat into the specified directory as 3.dat. Notice the escape character before the slash.
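A self-contained sketch of `shutil.copyfile`. The file names follow the 1.dat and 3.dat naming used above, but the directory is a temporary one created for the demo; building paths with `os.path.join` avoids having to escape backslashes by hand.

```python
import os
import shutil
import tempfile

workDir = tempfile.mkdtemp()                 # throwaway directory for the demo
src = os.path.join(workDir, '1.dat')
dst = os.path.join(workDir, '3.dat')

with open(src, 'w') as f:
    f.write('some data\n')

shutil.copyfile(src, dst)                    # dst is overwritten if it already exists

with open(dst) as f:
    print(f.read())
```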
Create a new directory
import os
os.makedirs(newDirectoryName)
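`os.makedirs` raises an `OSError` if the directory already exists, which is easy to hit when you rerun an experiment script. One way to guard against that, following the try-it-and-catch style, is sketched below; the path is hypothetical and lives in a temporary directory.

```python
import os
import tempfile

# Hypothetical nested path inside a throwaway directory
newDirectoryName = os.path.join(tempfile.mkdtemp(), 'data', 'subject01')

def ensure_dir(path):
    """Create path (and any missing parents); do nothing if it already exists."""
    try:
        os.makedirs(path)
    except OSError:
        # Re-raise unless the failure was simply "directory already there"
        if not os.path.isdir(path):
            raise

ensure_dir(newDirectoryName)
ensure_dir(newDirectoryName)      # second call is harmless
print(os.path.isdir(newDirectoryName))
```

`ensure_dir` is a helper name invented for this sketch.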
Some simple generator functions
Here's a function that implements an infinite list of odd numbers.
def oddNum(start):
    while True:
        if start % 2 ==0:
            start+=1
        yield start
        start+=1
Here's one way to use it:
Get 30 odd numbers starting at 1
someOddNums = oddNum(1) #start it at 1
for i in range (30):
    print someOddNums.next()
Here's another way using list comprehension:
moreOddNums = oddNum(1) #start it at 1
[moreOddNums.next() for i in range(30)]
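The `.next()` method used above is Python 2 only. The built-in `next()` function works in Python 2.6+ and Python 3, and `itertools.islice` takes a fixed number of items from an infinite generator without an explicit loop. A sketch, with oddNum repeated so the block runs on its own:

```python
from itertools import islice

def oddNum(start):
    """Yield odd numbers forever, beginning at the first odd number >= start."""
    while True:
        if start % 2 == 0:
            start += 1
        yield start
        start += 1

gen = oddNum(1)
print(next(gen))                 # 1
print(next(gen))                 # 3

# islice takes the next n items from wherever the generator currently is
print(list(islice(gen, 5)))      # [5, 7, 9, 11, 13]
```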
Here's a generator function for implementing a circular list. If you pass in a number, it will create a list of integers of that length, i.e., circularList(5) will create a circular list from [0,1,2,3,4]. If you pass in a list, it will make a circular list out of what you pass in, e.g., circularList(['a','b','c']) will create a circular list from ['a','b','c'].
def circularList(lst):
    if not isinstance(lst,list) and isinstance(lst,int):
        lst = range(lst)
    i = 0
    while True:
        yield lst[i]
        i = (i + 1)%len(lst) #try this out to understand the logic
To use it, create a new generator by assigning it to a variable:
myGenerator = circularList(lst)
where lst is the list you'd like to iterate through continuously. Notice the conditional in the first line of the circularList function. This allows the function to take in either a list or an integer. In the latter case, the function constructs a new list of that length, e.g., circularList(3) will iterate through the list [0,1,2] ad infinitum:
>>> myGenerator = circularList([0,1,2])
>>> myGenerator.next()
0
>>> myGenerator.next()
1
>>> myGenerator.next()
2
>>> myGenerator.next()
0
...
See what happens if you make a generator using a character string, e.g., myGenerator = circularList('spam').
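For the plain, unshuffled case, the standard library already provides a circular iterator: `itertools.cycle`. A brief sketch, not a replacement for understanding the generator above:

```python
from itertools import cycle, islice

circle = cycle(['a', 'b', 'c'])
print(next(circle))                  # a
print(next(circle))                  # b

# Take the next five items: c, then a full pass, then a
print(list(islice(circle, 5)))       # ['c', 'a', 'b', 'c', 'a']
```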
Here's a slightly more complex version of the circularList generator. The basic version above iterates through the list always in the same order. It is more likely that you'll want to iterate through it in a less ordered way. The variant below shuffles the list after each complete passthrough. Moreover, the shuffling is controlled by a seed so that each time you run it with the same seed, you'll get the same sequence of randomizations.
import random
def randomizingCircularList(lst,seed):
    if not isinstance(lst,list):
        lst = range(lst)
    i = 0
    random.seed(seed)
    while True:
        yield lst[i]
        if (i+1) % len(lst) ==0:
            random.shuffle(lst)
        i = (i + 1)%len(lst)

>>> newCircle=randomizingCircularList(['a','b','c'],10)
>>> for i in range(10):
...     print newCircle.next()
a
b
c
c
a
b
b
c
a
b
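A variant sketch of the same idea that avoids two side effects of the version above: it shuffles a private copy rather than the caller's list, and it uses its own `random.Random(seed)` instance so that seeding does not reset the global random state used elsewhere in an experiment. This is one possible design, not the original author's; the exact shuffled order depends on the Python version, so no output is shown.

```python
import random
from itertools import islice

def randomizingCircularList(lst, seed):
    """Cycle through lst forever, reshuffling after each complete pass."""
    if isinstance(lst, int):
        lst = range(lst)
    lst = list(lst)                  # private copy; the caller's list is untouched
    rng = random.Random(seed)        # private generator, independent of random.seed()
    while True:
        for item in lst:
            yield item
        rng.shuffle(lst)             # reshuffle once the pass is finished

original = ['a', 'b', 'c']
firstRun = list(islice(randomizingCircularList(original, 10), 9))
secondRun = list(islice(randomizingCircularList(original, 10), 9))
print(firstRun == secondRun)         # True: same seed, same sequence
print(original)                      # ['a', 'b', 'c']
```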
Here is a simple counter class:
class Counter:
    """A simple counting class"""
    def __init__(self):
        """Initialize a counter to zero."""
        self.count= 0
    def __str__(self):
        """Return a string of the count."""
        return str(self.count)
    def increment(self, amount):
        """Increment the counter."""
        self.count+= amount
    def reset(self):
        """Reset the counter to zero."""
        self.count= 0
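A brief usage sketch for the Counter class. The class is repeated in condensed form so the block runs on its own, and `trialCounter` is a hypothetical variable name.

```python
class Counter:
    """A simple counting class (condensed copy of the one above)."""
    def __init__(self):
        self.count = 0
    def __str__(self):
        return str(self.count)
    def increment(self, amount):
        self.count += amount
    def reset(self):
        self.count = 0

trialCounter = Counter()
trialCounter.increment(1)
trialCounter.increment(4)
print(trialCounter)          # 5, because print calls __str__
trialCounter.reset()
print(trialCounter)          # 0
```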
Here's another simple class:
class BankAccount():
    def __init__(self, initial_balance=0):
        self.balance = initial_balance
    def deposit(self, amount):
        self.balance += amount
    def withdraw(self, amount):
        self.balance -= amount
    def overdrawn(self):
        return self.balance < 0
Creating an instance of the BankAccount class and manipulating the balance is as simple as:
my_account = BankAccount(15)
my_account.withdraw(5)
print my_account.balance