Monday July 12, 2010
2009 Bureau of Labor Statistics Salary Data mysqldump file
The Bureau of Labor Statistics’ (BLS’s) 2009 Salaray Data is now online, and it’s pretty interesting.
I’ve turned their three big Metro Area Exel files (available on the downloads page) into a mysqldump file to make for easy querying.
Note: this file is an 8MB gzip, so rather than open in your browser, I strongly suggest right- (or on a Mac, control-) click this link to save it as a file: bls_salaries.sql.gz
Once you've expand the file and imported it into a database ($ gunzip bls_salaries.sql.gz; mysql -p db_name < bls_salaries.sql) you can start playing with the data.
Here's the key to the columns:
- id - an id column
- state - the two letter US state
- area - the code for the metro area
- area_name - the description of the metro area
- occ_code - the code for the occupation category
- occ_title - the description of the occupation category
- group - used to denote summary occupation categories
- total_emp - the estimated total employment for the category
- emp_prse - the percent relative standard error for the employment category
- jobs_1000 - the number of jobs in the employment category per 1,000 jobs in the metro area
- h_mean - the mean hourly wage
- a_mean - the mean annual wage
- mean_prse - the percent relative standard error for the mean wage
And the a_ (or h_) pctXX and median fields are the annual (a_) and hourly (h_) percentile wages, e.g. a_pct10 is the 10th percentile wage, and h_median is the hourly median (50th percentile) wage.
Here are some sample queries to get you started:
mysql> select count(distinct occ_code) from MSA;
+--------------------------+
| count(distinct occ_code) |
+--------------------------+
| 816 |
+--------------------------+
1 row in set (0.32 sec)
mysql> select distinct occ_code, occ_title from MSA order by occ_title limit 10;
+----------+------------------------------------------------------------------+
| occ_code | occ_title |
+----------+------------------------------------------------------------------+
| 13-2011 | Accountants and auditors |
| 27-2011 | Actors |
| 15-2011 | Actuaries |
| 23-1021 | Administrative law judges, adjudicators, and hearing officers |
| 11-3011 | Administrative services managers |
| 25-3011 | Adult literacy, remedial education, and GED teachers and instruc |
| 11-2011 | Advertising and promotions managers |
| 41-3011 | Advertising sales agents |
| 17-3021 | Aerospace engineering and operations technicians |
| 17-2011 | Aerospace engineers |
+----------+------------------------------------------------------------------+
10 rows in set (0.80 sec)
mysql> select distinct occ_code, occ_title from MSA where occ_title like "%oftware%" order by occ_title limit 10;
+----------+-----------------------------------------------+
| occ_code | occ_title |
+----------+-----------------------------------------------+
| 15-1031 | Computer software engineers, applications |
| 15-1032 | Computer software engineers, systems software |
+----------+-----------------------------------------------+
2 rows in set (0.29 sec)
mysql> select count(distinct area) from MSA;
+----------------------+
| count(distinct area) |
+----------------------+
| 400 |
+----------------------+
1 row in set (0.21 sec)
mysql> select distinct area, area_name from MSA where state="CA";
+-------+----------------------------------------------------------------+
| area | area_name |
+-------+----------------------------------------------------------------+
| 12540 | Bakersfield, CA |
| 17020 | Chico, CA |
| 20940 | El Centro, CA |
| 23420 | Fresno, CA |
| 25260 | Hanford-Corcoran, CA |
| 31084 | Los Angeles-Long Beach-Glendale, CA Metropolitan Division |
| 31460 | Madera, CA |
| 32900 | Merced, CA |
| 33700 | Modesto, CA |
| 34900 | Napa, CA |
| 36084 | Oakland-Fremont-Hayward, CA Metropolitan Division |
| 37100 | Oxnard-Thousand Oaks-Ventura, CA |
| 39820 | Redding, CA |
| 40140 | Riverside-San Bernardino-Ontario, CA |
| 40900 | Sacramento--Arden-Arcade--Roseville, CA |
| 41500 | Salinas, CA |
| 41740 | San Diego-Carlsbad-San Marcos, CA |
| 41884 | San Francisco-San Mateo-Redwood City, CA Metropolitan Division |
| 41940 | San Jose-Sunnyvale-Santa Clara, CA |
| 42020 | San Luis Obispo-Paso Robles, CA |
| 42044 | Santa Ana-Anaheim-Irvine, CA Metropolitan Division |
| 42060 | Santa Barbara-Santa Maria-Goleta, CA |
| 42100 | Santa Cruz-Watsonville, CA |
| 42220 | Santa Rosa-Petaluma, CA |
| 44700 | Stockton, CA |
| 46700 | Vallejo-Fairfield, CA |
| 47300 | Visalia-Porterville, CA |
| 49700 | Yuba City, CA |
+-------+----------------------------------------------------------------+
28 rows in set (0.31 sec)
mysql> select count(distinct state) from MSA;
+-----------------------+
| count(distinct state) |
+-----------------------+
| 52 |
+-----------------------+
1 row in set (0.30 sec)
mysql> select area_name, occ_title, a_pct10, a_median, a_pct90 from MSA where area in (36084, 41884, 41940)
-> and occ_code in ("15-1031", "15-1032") order by a_median DESC\G
*************************** 1. row ***************************
area_name: San Jose-Sunnyvale-Santa Clara, CA
occ_title: Computer software engineers, systems software
a_pct10: 77940
a_median: 124830
a_pct90: 0
*************************** 2. row ***************************
area_name: San Jose-Sunnyvale-Santa Clara, CA
occ_title: Computer software engineers, applications
a_pct10: 72900
a_median: 116260
a_pct90: 156040
*************************** 3. row ***************************
area_name: San Francisco-San Mateo-Redwood City, CA Metropolitan Division
occ_title: Computer software engineers, systems software
a_pct10: 73780
a_median: 111450
a_pct90: 159720
*************************** 4. row ***************************
area_name: San Francisco-San Mateo-Redwood City, CA Metropolitan Division
occ_title: Computer software engineers, applications
a_pct10: 68220
a_median: 105560
a_pct90: 147860
*************************** 5. row ***************************
area_name: Oakland-Fremont-Hayward, CA Metropolitan Division
occ_title: Computer software engineers, applications
a_pct10: 66120
a_median: 100470
a_pct90: 157890
*************************** 6. row ***************************
area_name: Oakland-Fremont-Hayward, CA Metropolitan Division
occ_title: Computer software engineers, systems software
a_pct10: 60240
a_median: 94540
a_pct90: 135730
6 rows in set (0.01 sec)
