A basic SQL cheatsheet is available at http://files.zeroturnaround.com/pdf/zt_sql_cheat_sheet.pdf.
There are many commands in SQL some of the most important are illustrated below:
SELECT col_name FROM table_name WHERE condition
UPDATE table_name SET col_name = 5 WHERE condition
INSERT INTO table_name (col_name1, col_name2,...) VALUES (value1, value2, ...)
DELETE FROM table_name WHERE condition;
SELECT COUNT(col_name) FROM table_name WHERE condition
SELECT table1.col1, table1.col2, table2.col1 FROM table1 INNER JOIN table2 on table1.key= table2.key
Using SAS the PROC SQL procedure can be used to query datastate systems or SAS data sets. The syntax follows as:
PROC SQL;
SQL statements;
RUN;
We will focus on SAS syntax, but there are minimal (or no) differences in other applications.
For this lecture and the lab, we will use a subset of baseball tables from https://www.kaggle.com/seanlahman/the-history-of-baseball.
libname STAT408 '/folders/myfolders/';
/* Note when using SAS OnDemand set this libname to
LIBNAME STAT408 "/courses/d716b355ba27fe300";
*/
574 ods listing close;ods html5 file=stdout options(bitmap_mode='inline') device=png; ods graphics on / outputfmt=png;
NOTE: Writing HTML5 Body file: STDOUT
575
576 libname STAT408 '/folders/myfolders/';
NOTE: Libref STAT408 was successfully assigned as follows:
Engine: V9
Physical Name: /folders/myfolders
577
578 /* Note when using SAS OnDemand set this libname to
579 LIBNAME STAT408 "/courses/d716b355ba27fe300";
580 */
581 ods html5 close;ods listing;
582
Select is the workhorse command for SQL queries. Select can be used to query data from tables.
A select query must have a SELECT statement specifying which columns to query and a FROM statement to identify the table. There are additional options below, which must follow this order.
Q1. Create a query from STAT408.player to obtain all the player first and last names for all players born in 1993.
Q2. Select all rows from STAT408.park where the park is in California.
/* Create a query from STAT408.player to obtain all the player
first and last names for all players born in 1993. */
PROC SQL;
SELECT name_first, name_last
FROM STAT408.player
WHERE birth_year = 1993;
RUN;
The SAS System
name_first | name_last |
---|---|
Miguel | Almonte |
Byron | Buxton |
Michael | Conforto |
Zach | Davies |
Michael | Feliz |
Brandon | Finnegan |
Joey | Gallo |
Oscar | Hernandez |
Adrian | Houser |
Keone | Kela |
Max | Kepler |
Jacob | Lindgren |
Francisco | Lindor |
Jorge | Lopez |
Ketel | Marte |
Lance | McCullers |
Frankie | Montas |
Aaron | Nola |
Daniel | Norris |
Jorge | Polanco |
Jurickson | Profar |
Eduardo | Rodriguez |
Joe | Ross |
Miguel | Sano |
Luis | Sardinas |
Kyle | Schwarber |
Pedro | Severino |
Trea | Turner |
Out[7]:
*Select all rows from STAT408.park where the park is in California.;
PROC SQL;
SELECT *
FROM STAT408.park
WHERE state = 'CA';
RUN;
The SAS System
park_id | park_name | park_alias | city | state | country |
---|---|---|---|---|---|
ANA01 | Angel Stadium of Anaheim | Edison Field; Anaheim Stadium | Anaheim | CA | US |
LOS01 | Los Angeles Memorial Coliseum | Los Angeles | CA | US | |
LOS02 | Wrigley Field | Los Angeles | CA | US | |
LOS03 | Dodger Stadium | Chavez Ravine | Los Angeles | CA | US |
OAK01 | Oakland-Alameda County Coliseum | Network Associates Coliseum | Oakland | CA | US |
SAN01 | Qualcomm Stadium | San Diego/Jack Murphy Stadium | San Diego | CA | US |
SAN02 | PETCO Park | San Diego | CA | US | |
SFO01 | Seals Stadium | San Francisco | CA | US | |
SFO02 | Candlestick Park | 3Com Park | San Francisco | CA | US |
SFO03 | AT&T Park | Pacific Bell Park; SBC Park | San Francisco | CA | US |
There are four main sets of aggregate type functions in SQL
Q: How do we write a query to calculate the total number of homeruns in each year since 1980?
*write a query to calculate the total number of homeruns in each year since 1980;
PROC SQL;
SELECT year, SUM(hr)
FROM STAT408.batting
GROUP BY year
HAVING year GT 1979
ORDER BY year;
RUN;
The SAS System
year | |
---|---|
1980 | 3087 |
1981 | 1781 |
1982 | 3379 |
1983 | 3301 |
1984 | 3258 |
1985 | 3602 |
1986 | 3813 |
1987 | 4458 |
1988 | 3180 |
1989 | 3083 |
1990 | 3317 |
1991 | 3383 |
1992 | 3038 |
1993 | 4030 |
1994 | 3306 |
1995 | 4081 |
1996 | 4962 |
1997 | 4640 |
1998 | 5064 |
1999 | 5528 |
2000 | 5693 |
2001 | 5458 |
2002 | 5059 |
2003 | 5207 |
2004 | 5451 |
2005 | 5017 |
2006 | 5386 |
2007 | 4957 |
2008 | 4878 |
2009 | 5042 |
2010 | 4613 |
2011 | 4552 |
2012 | 4934 |
2013 | 4661 |
2014 | 4186 |
2015 | 4909 |
Thus far we have only been querying tables and producing output. Next we will see how to create and modify tables.
PROC SQL;
CREATE TABLE table_name as
SELECT col1, col2, col3
FROM old_table;
RUN;
Q1. Create a table called born93 with info from all players born in 1993.
*\ Create a new table with info from players born in 1993 \*
PROC SQL;
CREATE TABLE born93 as
SELECT player_id, birth_country, birth_state, name_first, name_last FROM STAT408.player
WHERE birth_year = 1993;
RUN;
There are several ways to modify an existing table:
You can manually insert values into a table as follows:
INSERT INTO table_name (col1, col2, col3, ...)
VALUES (val1, val2, val3);
Note the values can be computed from a SELECT statement as well
INSERT INTO table_name (col1, col2, col3, ...)
SELECT val1, val2, val3 FROM table_name2;
Q: Write a PROC SQL command to add your name to the born93 table.
PROC SQL;
INSERT INTO born93 (player_id, birth_country, birth_state, name_first, name_last)
VALUES ('hoeghan01', 'USA','IA','Andrew','Hoegh');
RUN;
PROC SQL;
SELECT *
FROM born93
WHERE birth_state = 'IA';
RUN;
The SAS System
player_id | birth_country | birth_state | name_first | name_last |
---|---|---|---|---|
hoeghan01 | USA | IA | Andrew | Hoegh |
The update value allows you to update multiple records using a WHERE clause.
UPDATE table_name
SET col1 = value
WHERE col2= value;
PROC SQL;
UPDATE born93
SET birth_state = 'Other'
WHERE birth_country NE 'USA';
RUN;
PROC SQL;
SELECT *
FROM born93
WHERE birth_state = 'Other';
RUN;
The SAS System
player_id | birth_country | birth_state | name_first | name_last |
---|---|---|---|---|
almonmi01 | D.R. | Other | Miguel | Almonte |
felizmi01 | D.R. | Other | Michael | Feliz |
hernaos01 | Venezuela | Other | Oscar | Hernandez |
keplema01 | Germany | Other | Max | Kepler |
lindofr01 | P.R. | Other | Francisco | Lindor |
lopezjo02 | P.R. | Other | Jorge | Lopez |
marteke01 | D.R. | Other | Ketel | Marte |
montafr01 | D.R. | Other | Frankie | Montas |
polanjo01 | D.R. | Other | Jorge | Polanco |
profaju01 | Curacao | Other | Jurickson | Profar |
rodried05 | Venezuela | Other | Eduardo | Rodriguez |
sanomi01 | D.R. | Other | Miguel | Sano |
sardilu01 | Venezuela | Other | Luis | Sardinas |
severpe01 | D.R. | Other | Pedro | Severino |
This allows you to delete columns, meeting a specified criteria, from a table.
DELETE FROM table
WHERE condition;
PROC SQL;
DELETE FROM born93
WHERE name_last LIKE "%egh";
RUN;
* query number of rows;
PROC SQL;
SELECT COUNT(*)
FROM born93;
RUN;
The SAS System
28 |
There are several types of joins, some of which can be visualized below:
Inline-style:
SELECT l.var1, l.var2, r.var3, ...
FROM lefttab as l
INNER JOIN righttab as r
ON l.key=r.key;
Other options for the join argument are:
LEFT JOIN
RIGHT JOIN
FULL JOIN
Q: Sketch out a SQL command to conduct an inner join on born93 and batting.
PROC SQL;
SELECT *
FROM born93 as b
INNER JOIN STAT408.batting as s
ON b.player_id = s.player_id;
RUN;
The SAS System
player_id | birth_country | birth_state | name_first | name_last | player_id | year | stint | team_id | league_id | g | ab | r | h | double | triple | hr | rbi | sb | cs | bb | so | ibb | hbp | sh | sf | g_idp |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
profaju01 | Curacao | Other | Jurickson | Profar | profaju01 | 2012 | 1 | TEX | A | 9 | 17 | 2 | 3 | 2 | 0 | 1 | 2 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 1 |
profaju01 | Curacao | Other | Jurickson | Profar | profaju01 | 2013 | 1 | TEX | A | 85 | 286 | 30 | 67 | 11 | 0 | 6 | 26 | 2 | 4 | 26 | 63 | 0 | 5 | 6 | 1 | 1 |
finnebr01 | USA | TX | Brandon | Finnegan | finnebr01 | 2014 | 1 | KCA | A | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
norrida01 | USA | TN | Daniel | Norris | norrida01 | 2014 | 1 | TOR | A | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
polanjo01 | D.R. | Other | Jorge | Polanco | polanjo01 | 2014 | 1 | MIN | A | 5 | 6 | 2 | 2 | 1 | 1 | 0 | 3 | 0 | 0 | 2 | 2 | 0 | 0 | 0 | 0 | 0 |
sardilu01 | Venezuela | Other | Luis | Sardinas | sardilu01 | 2014 | 1 | TEX | A | 43 | 115 | 12 | 30 | 6 | 0 | 0 | 8 | 5 | 1 | 5 | 21 | 0 | 2 | 3 | 0 | 5 |
almonmi01 | D.R. | Other | Miguel | Almonte | almonmi01 | 2015 | 1 | KCA | A | 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
buxtoby01 | USA | GA | Byron | Buxton | buxtoby01 | 2015 | 1 | MIN | A | 46 | 129 | 16 | 27 | 7 | 1 | 2 | 6 | 2 | 2 | 6 | 44 | 0 | 1 | 2 | 0 | 1 |
confomi01 | USA | WA | Michael | Conforto | confomi01 | 2015 | 1 | NYN | N | 56 | 174 | 30 | 47 | 14 | 0 | 9 | 26 | 0 | 1 | 17 | 39 | 0 | 1 | 0 | 2 | 4 |
davieza01 | USA | WA | Zach | Davies | davieza01 | 2015 | 1 | MIL | N | 6 | 13 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 1 | 0 | 0 |
felizmi01 | D.R. | Other | Michael | Feliz | felizmi01 | 2015 | 1 | HOU | A | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
finnebr01 | USA | TX | Brandon | Finnegan | finnebr01 | 2015 | 1 | KCA | A | 14 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
finnebr01 | USA | TX | Brandon | Finnegan | finnebr01 | 2015 | 2 | CIN | N | 6 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 |
gallojo01 | USA | NV | Joey | Gallo | gallojo01 | 2015 | 1 | TEX | A | 36 | 108 | 16 | 22 | 3 | 1 | 6 | 14 | 3 | 0 | 15 | 57 | 3 | 0 | 0 | 0 | 0 |
hernaos01 | Venezuela | Other | Oscar | Hernandez | hernaos01 | 2015 | 1 | ARI | N | 18 | 31 | 4 | 5 | 1 | 0 | 0 | 1 | 0 | 0 | 3 | 15 | 0 | 1 | 1 | 0 | 0 |
housead01 | USA | OK | Adrian | Houser | housead01 | 2015 | 1 | MIL | N | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
kelake01 | USA | CA | Keone | Kela | kelake01 | 2015 | 1 | TEX | A | 68 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
keplema01 | Germany | Other | Max | Kepler | keplema01 | 2015 | 1 | MIN | A | 3 | 7 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 |
lindgja01 | USA | MS | Jacob | Lindgren | lindgja01 | 2015 | 1 | NYA | A | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
lindofr01 | P.R. | Other | Francisco | Lindor | lindofr01 | 2015 | 1 | CLE | A | 99 | 390 | 50 | 122 | 22 | 4 | 12 | 51 | 12 | 2 | 27 | 69 | 0 | 1 | 1 | 7 | 1 |
lopezjo02 | P.R. | Other | Jorge | Lopez | lopezjo02 | 2015 | 1 | MIL | N | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 1 | 0 | 0 |
marteke01 | D.R. | Other | Ketel | Marte | marteke01 | 2015 | 1 | SEA | A | 57 | 219 | 25 | 62 | 14 | 3 | 2 | 17 | 8 | 4 | 24 | 43 | 0 | 0 | 2 | 2 | 1 |
mcculla02 | USA | FL | Lance | McCullers | mcculla02 | 2015 | 1 | HOU | A | 22 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
montafr01 | D.R. | Other | Frankie | Montas | montafr01 | 2015 | 1 | CHA | A | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
nolaaa01 | USA | LA | Aaron | Nola | nolaaa01 | 2015 | 1 | PHI | N | 13 | 23 | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 16 | 0 | 0 | 2 | 0 | 0 |
norrida01 | USA | TN | Daniel | Norris | norrida01 | 2015 | 1 | TOR | A | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
norrida01 | USA | TN | Daniel | Norris | norrida01 | 2015 | 2 | DET | A | 8 | 2 | 1 | 1 | 0 | 0 | 1 | 2 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
polanjo01 | D.R. | Other | Jorge | Polanco | polanjo01 | 2015 | 1 | MIN | A | 4 | 10 | 1 | 3 | 0 | 0 | 0 | 1 | 1 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 |
rodried05 | Venezuela | Other | Eduardo | Rodriguez | rodried05 | 2015 | 1 | BOS | A | 21 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 1 | 0 | 0 |
rossjo01 | USA | CA | Joe | Ross | rossjo01 | 2015 | 1 | WAS | N | 17 | 27 | 2 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 11 | 0 | 1 | 2 | 0 | 1 |
sanomi01 | D.R. | Other | Miguel | Sano | sanomi01 | 2015 | 1 | MIN | A | 80 | 279 | 46 | 75 | 17 | 1 | 18 | 52 | 1 | 1 | 53 | 119 | 1 | 1 | 0 | 2 | 4 |
sardilu01 | Venezuela | Other | Luis | Sardinas | sardilu01 | 2015 | 1 | MIL | N | 36 | 97 | 8 | 19 | 0 | 1 | 0 | 4 | 0 | 0 | 6 | 25 | 1 | 0 | 1 | 1 | 3 |
schwaky01 | USA | OH | Kyle | Schwarber | schwaky01 | 2015 | 1 | CHN | N | 69 | 232 | 52 | 57 | 6 | 1 | 16 | 43 | 3 | 3 | 36 | 77 | 1 | 4 | 0 | 1 | 4 |
severpe01 | D.R. | Other | Pedro | Severino | severpe01 | 2015 | 1 | WAS | N | 2 | 4 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
turnetr01 | USA | FL | Trea | Turner | turnetr01 | 2015 | 1 | WAS | N | 27 | 40 | 5 | 9 | 1 | 0 | 1 | 1 | 2 | 2 | 4 | 12 | 0 | 0 | 0 | 0 | 0 |
Now conduct a query that is a left join on born93 (as the left table) with batting as the right table.
PROC SQL;
SELECT *
FROM born93 as b
LEFT JOIN STAT408.batting as s
ON b.player_id = s.player_id;
RUN;
PROC SQL;
SELECT *
FROM born93 as b
LEFT JOIN STAT408.batting as s
ON b.player_id = s.player_id;
RUN;
The SAS System
player_id | birth_country | birth_state | name_first | name_last | player_id | year | stint | team_id | league_id | g | ab | r | h | double | triple | hr | rbi | sb | cs | bb | so | ibb | hbp | sh | sf | g_idp |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
almonmi01 | D.R. | Other | Miguel | Almonte | almonmi01 | 2015 | 1 | KCA | A | 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
buxtoby01 | USA | GA | Byron | Buxton | buxtoby01 | 2015 | 1 | MIN | A | 46 | 129 | 16 | 27 | 7 | 1 | 2 | 6 | 2 | 2 | 6 | 44 | 0 | 1 | 2 | 0 | 1 |
confomi01 | USA | WA | Michael | Conforto | confomi01 | 2015 | 1 | NYN | N | 56 | 174 | 30 | 47 | 14 | 0 | 9 | 26 | 0 | 1 | 17 | 39 | 0 | 1 | 0 | 2 | 4 |
davieza01 | USA | WA | Zach | Davies | davieza01 | 2015 | 1 | MIL | N | 6 | 13 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 1 | 0 | 0 |
felizmi01 | D.R. | Other | Michael | Feliz | felizmi01 | 2015 | 1 | HOU | A | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
finnebr01 | USA | TX | Brandon | Finnegan | finnebr01 | 2015 | 1 | KCA | A | 14 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
finnebr01 | USA | TX | Brandon | Finnegan | finnebr01 | 2015 | 2 | CIN | N | 6 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 0 |
finnebr01 | USA | TX | Brandon | Finnegan | finnebr01 | 2014 | 1 | KCA | A | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
gallojo01 | USA | NV | Joey | Gallo | gallojo01 | 2015 | 1 | TEX | A | 36 | 108 | 16 | 22 | 3 | 1 | 6 | 14 | 3 | 0 | 15 | 57 | 3 | 0 | 0 | 0 | 0 |
hernaos01 | Venezuela | Other | Oscar | Hernandez | hernaos01 | 2015 | 1 | ARI | N | 18 | 31 | 4 | 5 | 1 | 0 | 0 | 1 | 0 | 0 | 3 | 15 | 0 | 1 | 1 | 0 | 0 |
housead01 | USA | OK | Adrian | Houser | housead01 | 2015 | 1 | MIL | N | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
kelake01 | USA | CA | Keone | Kela | kelake01 | 2015 | 1 | TEX | A | 68 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
keplema01 | Germany | Other | Max | Kepler | keplema01 | 2015 | 1 | MIN | A | 3 | 7 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 |
lindgja01 | USA | MS | Jacob | Lindgren | lindgja01 | 2015 | 1 | NYA | A | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
lindofr01 | P.R. | Other | Francisco | Lindor | lindofr01 | 2015 | 1 | CLE | A | 99 | 390 | 50 | 122 | 22 | 4 | 12 | 51 | 12 | 2 | 27 | 69 | 0 | 1 | 1 | 7 | 1 |
lopezjo02 | P.R. | Other | Jorge | Lopez | lopezjo02 | 2015 | 1 | MIL | N | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 1 | 0 | 0 |
marteke01 | D.R. | Other | Ketel | Marte | marteke01 | 2015 | 1 | SEA | A | 57 | 219 | 25 | 62 | 14 | 3 | 2 | 17 | 8 | 4 | 24 | 43 | 0 | 0 | 2 | 2 | 1 |
mcculla02 | USA | FL | Lance | McCullers | mcculla02 | 2015 | 1 | HOU | A | 22 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
montafr01 | D.R. | Other | Frankie | Montas | montafr01 | 2015 | 1 | CHA | A | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
nolaaa01 | USA | LA | Aaron | Nola | nolaaa01 | 2015 | 1 | PHI | N | 13 | 23 | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 16 | 0 | 0 | 2 | 0 | 0 |
norrida01 | USA | TN | Daniel | Norris | norrida01 | 2014 | 1 | TOR | A | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
norrida01 | USA | TN | Daniel | Norris | norrida01 | 2015 | 2 | DET | A | 8 | 2 | 1 | 1 | 0 | 0 | 1 | 2 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
norrida01 | USA | TN | Daniel | Norris | norrida01 | 2015 | 1 | TOR | A | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
polanjo01 | D.R. | Other | Jorge | Polanco | polanjo01 | 2014 | 1 | MIN | A | 5 | 6 | 2 | 2 | 1 | 1 | 0 | 3 | 0 | 0 | 2 | 2 | 0 | 0 | 0 | 0 | 0 |
polanjo01 | D.R. | Other | Jorge | Polanco | polanjo01 | 2015 | 1 | MIN | A | 4 | 10 | 1 | 3 | 0 | 0 | 0 | 1 | 1 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 |
profaju01 | Curacao | Other | Jurickson | Profar | profaju01 | 2013 | 1 | TEX | A | 85 | 286 | 30 | 67 | 11 | 0 | 6 | 26 | 2 | 4 | 26 | 63 | 0 | 5 | 6 | 1 | 1 |
profaju01 | Curacao | Other | Jurickson | Profar | profaju01 | 2012 | 1 | TEX | A | 9 | 17 | 2 | 3 | 2 | 0 | 1 | 2 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 1 |
rodried05 | Venezuela | Other | Eduardo | Rodriguez | rodried05 | 2015 | 1 | BOS | A | 21 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 1 | 0 | 0 |
rossjo01 | USA | CA | Joe | Ross | rossjo01 | 2015 | 1 | WAS | N | 17 | 27 | 2 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 11 | 0 | 1 | 2 | 0 | 1 |
sanomi01 | D.R. | Other | Miguel | Sano | sanomi01 | 2015 | 1 | MIN | A | 80 | 279 | 46 | 75 | 17 | 1 | 18 | 52 | 1 | 1 | 53 | 119 | 1 | 1 | 0 | 2 | 4 |
sardilu01 | Venezuela | Other | Luis | Sardinas | sardilu01 | 2014 | 1 | TEX | A | 43 | 115 | 12 | 30 | 6 | 0 | 0 | 8 | 5 | 1 | 5 | 21 | 0 | 2 | 3 | 0 | 5 |
sardilu01 | Venezuela | Other | Luis | Sardinas | sardilu01 | 2015 | 1 | MIL | N | 36 | 97 | 8 | 19 | 0 | 1 | 0 | 4 | 0 | 0 | 6 | 25 | 1 | 0 | 1 | 1 | 3 |
schwaky01 | USA | OH | Kyle | Schwarber | schwaky01 | 2015 | 1 | CHN | N | 69 | 232 | 52 | 57 | 6 | 1 | 16 | 43 | 3 | 3 | 36 | 77 | 1 | 4 | 0 | 1 | 4 |
severpe01 | D.R. | Other | Pedro | Severino | severpe01 | 2015 | 1 | WAS | N | 2 | 4 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
turnetr01 | USA | FL | Trea | Turner | turnetr01 | 2015 | 1 | WAS | N | 27 | 40 | 5 | 9 | 1 | 0 | 1 | 1 | 2 | 2 | 4 | 12 | 0 | 0 | 0 | 0 | 0 |
SAS Macro code consists of two components:
Macro variables is a single variable that does not belong to a data set and the value is always the character. Names of macro variables are begin with an ampersand, such as ¯ovar.
A macro is similar to an R function, but still bundles SAS DATA and PROC statements. Marco names start with a percent sign, such as %compile
.
Macro variables are particularly useful in industrial settings where procedures are run every week or month. Rather than modifying all of the code, a macro variable can be used to adjust the week or month.
Macro variables are defined with the following syntax
%LET year = 1993;
Q: Now write code to query the player data set to extract the count of players by birth year.
%LET year = 1993;
TITLE "Number of MLB players born in &year";
PROC SQL;
SELECT COUNT(*)
FROM STAT408.player
WHERE birth_year = &year;
RUN;
TITLE;
Number of MLB players born in 1993
28 |
%LET year = 1983;
TITLE "Number of MLB players born in &year";
PROC SQL;
SELECT COUNT(*)
FROM STAT408.player
WHERE birth_year = &year;
RUN;
TITLE;
Number of MLB players born in 1983
243 |
Similar to the last example, and functions in R, SAS macros can be used to streamline code for repeated procedures. As with R, debugging inside a macro environment can be more complicated. So make sure to that the code inserted in the macro statement is working before applying the macro commands.
The Little SAS Book describes macros as a sandwich,
%MACRO macro_name (parameter1= , ...);
macro_text
%MEND macro_name;
where the %MACRO
and %MEND
statements are the bread. After being defined, the macro can then be run using the following statement %macro_name(val1);
(if one parameter is defined).
Q: Now write a macro to print the number baseball players born in the years 1983, 1990, and 1993.
%MACRO yearborn (year =);
libname STAT408 '/folders/myfolders/';
TITLE "Number of MLB players born in &year";
PROC SQL;
SELECT COUNT(*)
FROM STAT408.player
WHERE birth_year = &year;
RUN;
TITLE;
%MEND yearborn;
%yearborn(year = 1983);
%yearborn(year = 1990);
%yearborn(year = 1993);
Number of MLB players born in 1983
243 |
Number of MLB players born in 1990
159 |
Number of MLB players born in 1993
28 |
Conditional logic can also be used inside a macro statements. For instance continuing the previous example, if we also want to include an option to create a table with the players names this can be done with conditional logic.
%MACRO yearborn (year =, create=);
libname STAT408 '/folders/myfolders/';
TITLE "Number of MLB players born in &year";
PROC SQL;
SELECT COUNT(*)
FROM STAT408.player
WHERE birth_year = &year;
RUN;
TITLE;
%IF &create = 'YES' %THEN %DO;
PROC SQL;
CREATE TABLE born&year as
SELECT *
FROM STAT408.player
WHERE birth_year = &year;
RUN;
%END;
%MEND yearborn;
%yearborn(year = 1995, create= 'YES');
Number of MLB players born in 1995
2 |
%yearborn(year = 1994, create = 'NO');
PROC PRINT DATA=born1994;
RUN;
779 ods listing close;ods html5 file=stdout options(bitmap_mode='inline') device=png; ods graphics on / outputfmt=png;
NOTE: Writing HTML5 Body file: STDOUT
780
781 %yearborn(year = 1994, create = 'NO');
NOTE: Libref STAT408 was successfully assigned as follows:
Engine: V9
Physical Name: /folders/myfolders
NOTE: PROCEDURE SQL used (Total process time):
real time 1.70 seconds
cpu time 0.06 seconds
NOTE: PROC SQL statements are executed immediately; The RUN statement has no effect.
782
NOTE: PROCEDURE SQL used (Total process time):
real time 0.02 seconds
cpu time 0.02 seconds
783 PROC PRINT DATA=born1994;
ERROR: File WORK.BORN1994.DATA does not exist.
784 RUN;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
785 ods html5 close;ods listing;
786
Number of MLB players born in 1994
8 |
PROC PRINT DATA=born1995;
RUN;
Obs | player_id | birth_year | birth_month | birth_day | birth_country | birth_state | birth_city | death_year | death_month | death_day | death_country | death_state | death_city | name_first | name_last | name_given | weight | height | bats | throws | debut | final_game | retro_id | bbref_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | mondera02 | 1995 | 7 | 27 | USA | CA | Los Angeles | . | . | . | Raul | Mondesi | Raul Adalberto | 185 | 73 | B | R | . | . | mondr003 | mondera02 | |||
2 | osunaro01 | 1995 | 2 | 7 | Mexico | Sinaloa | Juan Jose Rios | . | . | . | Roberto | Osuna | Roberto | 230 | 74 | R | R | 2015-04-08 | 2015-10-03 | osunr001 | osunaro01 |
IML stands for Interactive Matrix Language and provides some matrix algebra capacity. http://www.yorku.ca/pek/index_files/quickstart/IMLQuickStart.pdf
We are not going to cover it in detail, I only want you to know that it does exist.
proc iml;
a = I(3); * 3x3 identity matrix;
b = j(4,4,1); *4x4 matrix of 1's;
c=diag({1 2 4});
d=({1 2, 3 4});
print a; print b; print c; print d;
quit;
a | ||
---|---|---|
1 | 0 | 0 |
0 | 1 | 0 |
0 | 0 | 1 |
b | |||
---|---|---|---|
1 | 1 | 1 | 1 |
1 | 1 | 1 | 1 |
1 | 1 | 1 | 1 |
1 | 1 | 1 | 1 |
c | ||
---|---|---|
1 | 0 | 0 |
0 | 2 | 0 |
0 | 0 | 4 |
d | |
---|---|
1 | 2 |
3 | 4 |
proc iml;
X ={1 2, 3 4};
Y ={-4 3,-2 -1};
XY = X*Y; *matrix multiplication;
tX = t(X); *transpose using function;
DX = det(X); *matrix determinant;
X_inv = inv(X); *matrix inverse;
X_eigval = eigval(X); *eigen values of matrix;
X_eigvec = eigvec(X); *eigen vector of matrix;
print X; print Y; print XY; print tX; print DX; print X_inv; print X_eigval; print X_eigvec; quit;
X | |
---|---|
1 | 2 |
3 | 4 |
Y | |
---|---|
-4 | 3 |
-2 | -1 |
XY | |
---|---|
-8 | 1 |
-20 | 5 |
tX | |
---|---|
1 | 3 |
2 | 4 |
DX |
---|
-2 |
X_inv | |
---|---|
-2 | 1 |
1.5 | -0.5 |
X_eigval | |
---|---|
5.3722813 | 0 |
-0.372281 | 0 |
X_eigvec | |
---|---|
-0.415974 | -0.824565 |
-0.909377 | 0.5657675 |