This introduction to pandas is derived from Data School's pandas Q&A with my own notes and code.

Randomly sample rows from a DataFrame

import pandas as pd

link = 'http://bit.ly/uforeports'
ufo = pd.read_csv(link)

ufo.head()

# get 3 random rows
# without random_state, each run returns a different sample
ufo.sample(n=3)

# you can use random_state for reproducibility
ufo.sample(n=3, random_state=2)

# sample a fraction of the rows instead of a fixed count
# frac=0.75 returns 75% of the rows
ufo.sample(frac=0.75, random_state=99)
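
The calls above can be checked on a small stand-in DataFrame. This is only a sketch: `df` and its columns are made-up toy data (not the UFO dataset), chosen so the example runs without a download.

```python
import pandas as pd

# Hypothetical toy data standing in for the ufo DataFrame
df = pd.DataFrame({'City': list('ABCDEFGH'), 'value': range(8)})

# the same random_state gives the same sample on every call
first = df.sample(n=3, random_state=2)
second = df.sample(n=3, random_state=2)
print(first.equals(second))

# frac=0.75 of 8 rows gives 6 rows
subset = df.sample(frac=0.75, random_state=99)
print(len(subset))

# sampling is without replacement by default, so no row is picked twice
print(subset.index.is_unique)
```

Pass replace=True to sample() if you do want rows to be drawn more than once.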

For machine learning train-test split

You need non-overlapping rows in your train and test sets

train = ufo.sample(frac=0.75, random_state=99)

# sampling 25% separately could pick rows that are already in train
# instead, take every row whose index is not in train: the remaining 25%
# (this relies on the index labels being unique)
test = ufo.loc[~ufo.index.isin(train.index), :]
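
A quick way to convince yourself the split is clean is to check that the two index sets do not intersect and that together they cover every row. The sketch below uses a made-up 20-row DataFrame (`df` is hypothetical, with the default unique integer index) rather than the UFO data.

```python
import pandas as pd

# Hypothetical toy data with a default, unique RangeIndex
df = pd.DataFrame({'value': range(20)})

train = df.sample(frac=0.75, random_state=99)
test = df.loc[~df.index.isin(train.index), :]

# no row label appears in both sets
overlap = train.index.intersection(test.index)
print(len(overlap))

# together the two sets account for every row
print(len(train), len(test), len(df))

# with a unique index, dropping the train labels gives the same test set
test_alt = df.drop(train.index)
print(test.equals(test_alt))
```

If the index contains duplicate labels, both approaches remove every row sharing a label with train, so reset the index first in that case.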

	City	Colors Reported	Shape Reported	State	Time
0	Ithaca	NaN	TRIANGLE	NY	6/1/1930 22:00
1	Willingboro	NaN	OTHER	NJ	6/30/1930 20:00
2	Holyoke	NaN	OVAL	CO	2/15/1931 14:00
3	Abilene	NaN	DISK	KS	6/1/1931 13:00
4	New York Worlds Fair	NaN	LIGHT	NY	4/18/1933 19:00

	City	Colors Reported	Shape Reported	State	Time
13615	Hillsboro	NaN	TRIANGLE	OR	6/3/1999 0:44
140	East Palestine	NaN	LIGHT	OH	7/10/1950 20:30
15412	Ceder Lake	RED	LIGHT	IN	12/1/1999 6:00

	City	Colors Reported	Shape Reported	State	Time
7236	Mesquite	NaN	OTHER	NV	11/25/1993 15:00
14432	Pittsburg	NaN	OTHER	CA	9/4/1999 0:38
4559	Mondel	NaN	NaN	NM	6/18/1981 3:00

	City	Colors Reported	Shape Reported	State	Time
6250	Sunnyvale	NaN	OTHER	CA	12/16/1989 0:00
8656	Corpus Christi	NaN	NaN	TX	9/13/1995 0:10
2729	Mentor	NaN	DISK	OH	8/8/1974 10:00
7348	Wilson	NaN	LIGHT	WI	6/1/1994 1:00
12637	Lowell	NaN	CIRCLE	MA	11/26/1998 10:00
2094	Victorville	NaN	LIGHT	CA	6/6/1971 21:00
15905	Black Canyon City	BLUE	CIRCLE	AZ	2/16/2000 4:45
6792	Houston	NaN	CHEVRON	TX	6/10/1992 23:00
5063	Ely	NaN	DIAMOND	MN	6/15/1984 19:00
16626	Atlantic Ocean	NaN	NaN	NC	6/17/2000 0:35
17030	Portland	RED ORANGE	LIGHT	OR	7/27/2000 3:35
2391	Larchwood	NaN	DIAMOND	IA	6/6/1973 22:00
12210	Castaic	GREEN BLUE	CIGAR	CA	9/23/1998 22:45
11447	Friday Harbor	NaN	FIREBALL	WA	4/22/1998 21:20
14849	Breckenridge	NaN	OTHER	TX	10/15/1999 20:00
11056	Salida	NaN	LIGHT	CA	12/15/1997 20:00
16877	Redland	NaN	CIRCLE	OR	7/10/2000 20:30
9707	Seattle	NaN	SPHERE	WA	11/7/1996 23:30
9811	Bartlett	NaN	OTHER	TN	12/15/1996 23:00
16516	Chattanooga	NaN	DISK	TN	6/1/2000 17:00
3587	Albuquerque	NaN	OVAL	NM	7/24/1977 7:00
9288	Tampa	NaN	NaN	FL	5/6/1996 22:00
16360	Hwy 12	NaN	LIGHT	WA	5/1/2000 20:00
4718	Overland Park	NaN	CIGAR	KS	6/15/1982 14:00
9825	Woodville	NaN	TRIANGLE	TX	12/17/1996 20:17
3843	Spokane	NaN	DISK	WA	7/14/1978 22:30
17525	Jordan	BLUE	CIGAR	MN	9/26/2000 13:00
7973	Fort Wayne	NaN	NaN	IN	3/30/1995 0:00
16900	Casa	GREEN	CIGAR	AR	7/12/2000 23:00
14709	Oak Brook	GREEN	LIGHT	IL	10/1/1999 1:00
...	...	...	...	...	...
1353	Opa Locka	NaN	CHEVRON	FL	1/1/1967 13:00
3754	Paterson	NaN	DISK	NJ	6/1/1978 21:00
11495	Moultrie	NaN	VARIOUS	GA	5/4/1998 6:00
10053	New York City	NaN	OVAL	NY	3/14/1997 23:59
12868	Portland	ORANGE	SPHERE	OR	1/8/1999 18:00
11491	Portland	NaN	TRIANGLE	OR	5/2/1998 22:45
715	Aurora	NaN	LIGHT	CO	6/1/1962 20:00
18122	San Diego	NaN	LIGHT	CA	12/15/2000 4:05
358	Akron	NaN	OTHER	OH	6/6/1956 22:00
1552	Wheaton	RED	CIRCLE	MD	1/1/1968 23:00
3453	Long Green	NaN	DISK	MD	4/17/1977 16:30
16514	Albuquerque	NaN	LIGHT	NM	6/1/2000 15:00
14848	Newbern	NaN	LIGHT	TN	10/15/1999 19:20
11414	Catawba	NaN	TRIANGLE	OH	4/15/1998 7:45
11330	Okoboji	YELLOW GREEN	FIREBALL	IA	3/20/1998 18:00
10724	Pinckney	NaN	LIGHT	MI	8/20/1997 23:00
7773	Shawnee	RED	NaN	OK	2/7/1995 21:10
8988	Schnecksville	BLUE	NaN	PA	12/20/1995 23:50
5306	San Carlos	NaN	LIGHT	CA	7/31/1985 22:30
8731	San Francisco	NaN	NaN	CA	9/29/1995 12:50
7254	Fort Lauderdale	NaN	NaN	FL	1/1/1994 3:00
3622	Black River Falls	NaN	LIGHT	WI	8/18/1977 19:30
8241	Ann Arbor	NaN	NaN	MI	6/14/1995 1:35
13133	Fresno	NaN	CIGAR	CA	3/4/1999 7:15
7598	Spring Valley	NaN	LIGHT	CA	10/31/1994 18:00
8965	Lynnwood	NaN	NaN	WA	12/6/1995 22:45
4991	Kent	NaN	NaN	WA	12/5/1983 5:00
2740	Niagara Falls	NaN	TRIANGLE	NY	8/15/1974 20:00
11887	Vancouver	NaN	TRIANGLE	WA	7/25/1998 21:00
9809	Issaquah	NaN	NaN	WA	12/14/1996 20:20
