Monday July 12, 2010

2009 Bureau of Labor Statistics Salary Data mysqldump file

The Bureau of Labor Statistics’ (BLS’s) 2009 Salaray Data is now online, and it’s pretty interesting.

I’ve turned their three big Metro Area Exel files (available on the downloads page) into a mysqldump file to make for easy querying.

Note: this file is an 8MB gzip, so rather than open in your browser, I strongly suggest right- (or on a Mac, control-) click this link to save it as a file: bls_salaries.sql.gz

Once you've expand the file and imported it into a database ($ gunzip bls_salaries.sql.gz; mysql -p db_name < bls_salaries.sql) you can start playing with the data.

Here's the key to the columns:

And the a_ (or h_) pctXX and median fields are the annual (a_) and hourly (h_) percentile wages, e.g. a_pct10 is the 10th percentile wage, and h_median is the hourly median (50th percentile) wage.

Here are some sample queries to get you started:

mysql> select count(distinct occ_code) from MSA;
+--------------------------+
| count(distinct occ_code) |
+--------------------------+
|                      816 |
+--------------------------+
1 row in set (0.32 sec)


mysql> select distinct occ_code, occ_title from MSA order by occ_title limit 10;
+----------+------------------------------------------------------------------+
| occ_code | occ_title                                                        |
+----------+------------------------------------------------------------------+
| 13-2011  | Accountants and auditors                                         |
| 27-2011  | Actors                                                           |
| 15-2011  | Actuaries                                                        |
| 23-1021  | Administrative law judges, adjudicators, and hearing officers    |
| 11-3011  | Administrative services managers                                 |
| 25-3011  | Adult literacy, remedial education, and GED teachers and instruc |
| 11-2011  | Advertising and promotions managers                              |
| 41-3011  | Advertising sales agents                                         |
| 17-3021  | Aerospace engineering and operations technicians                 |
| 17-2011  | Aerospace engineers                                              |
+----------+------------------------------------------------------------------+
10 rows in set (0.80 sec)


mysql> select distinct occ_code, occ_title from MSA where occ_title like "%oftware%" order by occ_title limit 10;
+----------+-----------------------------------------------+
| occ_code | occ_title                                     |
+----------+-----------------------------------------------+
| 15-1031  | Computer software engineers, applications     |
| 15-1032  | Computer software engineers, systems software |
+----------+-----------------------------------------------+
2 rows in set (0.29 sec)


mysql> select count(distinct area) from MSA;
+----------------------+
| count(distinct area) |
+----------------------+
|                  400 |
+----------------------+
1 row in set (0.21 sec)


mysql> select distinct area, area_name from MSA where state="CA";
+-------+----------------------------------------------------------------+
| area  | area_name                                                      |
+-------+----------------------------------------------------------------+
| 12540 | Bakersfield, CA                                                |
| 17020 | Chico, CA                                                      |
| 20940 | El Centro, CA                                                  |
| 23420 | Fresno, CA                                                     |
| 25260 | Hanford-Corcoran, CA                                           |
| 31084 | Los Angeles-Long Beach-Glendale, CA  Metropolitan Division     |
| 31460 | Madera, CA                                                     |
| 32900 | Merced, CA                                                     |
| 33700 | Modesto, CA                                                    |
| 34900 | Napa, CA                                                       |
| 36084 | Oakland-Fremont-Hayward, CA Metropolitan Division              |
| 37100 | Oxnard-Thousand Oaks-Ventura, CA                               |
| 39820 | Redding, CA                                                    |
| 40140 | Riverside-San Bernardino-Ontario, CA                           |
| 40900 | Sacramento--Arden-Arcade--Roseville, CA                        |
| 41500 | Salinas, CA                                                    |
| 41740 | San Diego-Carlsbad-San Marcos, CA                              |
| 41884 | San Francisco-San Mateo-Redwood City, CA Metropolitan Division |
| 41940 | San Jose-Sunnyvale-Santa Clara, CA                             |
| 42020 | San Luis Obispo-Paso Robles, CA                                |
| 42044 | Santa Ana-Anaheim-Irvine, CA Metropolitan Division             |
| 42060 | Santa Barbara-Santa Maria-Goleta, CA                           |
| 42100 | Santa Cruz-Watsonville, CA                                     |
| 42220 | Santa Rosa-Petaluma, CA                                        |
| 44700 | Stockton, CA                                                   |
| 46700 | Vallejo-Fairfield, CA                                          |
| 47300 | Visalia-Porterville, CA                                        |
| 49700 | Yuba City, CA                                                  |
+-------+----------------------------------------------------------------+
28 rows in set (0.31 sec)


mysql> select count(distinct state) from MSA;
+-----------------------+
| count(distinct state) |
+-----------------------+
|                    52 |
+-----------------------+
1 row in set (0.30 sec)


mysql> select area_name, occ_title, a_pct10, a_median, a_pct90 from MSA where area in (36084, 41884, 41940)
    -> and occ_code in ("15-1031", "15-1032") order by a_median DESC\G
*************************** 1. row ***************************
area_name: San Jose-Sunnyvale-Santa Clara, CA
occ_title: Computer software engineers, systems software
  a_pct10: 77940
 a_median: 124830
  a_pct90: 0
*************************** 2. row ***************************
area_name: San Jose-Sunnyvale-Santa Clara, CA
occ_title: Computer software engineers, applications
  a_pct10: 72900
 a_median: 116260
  a_pct90: 156040
*************************** 3. row ***************************
area_name: San Francisco-San Mateo-Redwood City, CA Metropolitan Division
occ_title: Computer software engineers, systems software
  a_pct10: 73780
 a_median: 111450
  a_pct90: 159720
*************************** 4. row ***************************
area_name: San Francisco-San Mateo-Redwood City, CA Metropolitan Division
occ_title: Computer software engineers, applications
  a_pct10: 68220
 a_median: 105560
  a_pct90: 147860
*************************** 5. row ***************************
area_name: Oakland-Fremont-Hayward, CA Metropolitan Division
occ_title: Computer software engineers, applications
  a_pct10: 66120
 a_median: 100470
  a_pct90: 157890
*************************** 6. row ***************************
area_name: Oakland-Fremont-Hayward, CA Metropolitan Division
occ_title: Computer software engineers, systems software
  a_pct10: 60240
 a_median: 94540
  a_pct90: 135730
6 rows in set (0.01 sec)

Comments:

  Textile Help