Department of Computer Science & Engineering University of California - Riverside
CS166: Database Management Systems
FALL 2004
POSTGRES MANUAL
INDEX
Postgres Configuration Manual on Linux
1) Log in with your username under Linux
2) Create the directory structure and get started
3) Initialize the database template
4) Start the Database
5) Shutdown the database
6) Using the psql Client Terminal
7) Declaring Tables :- DDL (Data Definition Language)
8) Manipulating Data :- DML (Data Manipulation Language)
9) Bulk Loading Data
10) Using the right Data types
Detailed Manual can be located on the web page: http://developer.postgresql.org/docs/postgres/
Postgres Configuration Manual on Linux
1) Log in with your username under Linux.
This year you no longer get accounts specific to each course that you take. Instead, you've been given a folder called Classes that is located in your home directory. Inside this folder you should find a symbolic link to a folder called cs166. Many of the files that you will be using for this project are quite large and will cause you student account to exceed its storage limits. Therefore you should place the files within the cs166 folder that will provide the space you will need for these files. Postgres v7.3 is already installed on the system and should be available from the labs located in the Surge Building.
Steps 2-3 should only be performed once
2) Create the directory structure and get started
Within your cs166 folder you should create a directory called test. Within the test folder create the three folders database, tables, and data.
$ mkdir test
$ mkdir test/mydb This will create an empty folder. In this folder you will create the database template as described in section 3.
$ mkdir test/tables This will create an empty folder. In this folder you should place your table definitions as described in section 8.
$ mkdir test/data In this folder you should store anything related to data, including the given data files.
3) Initialize the database template.
Export the environment variable "PGDATA", pointing to the directory where the database is located
$ export PGDATA=$PWD/mydb
Then call "initdb" to create an initial database template. This basically creates the initial structures required by the database. It is not the mydb Database. These are just structures required by Postgres. This template can then be used to host any number of databases.
$ initdb
(You are not expected to understand what each message is supposed to mean)
4) Start the DBMS
Now you have to start the Postgres DBMS by specifying a database instance (i.e. mydb). Recall that this is just the Database without any tables, queries, etc. Below are the command and the result:
$ pg_ctl -o -i -l logfile start
postmaster successfully started
* The "-o -i" option is necessary for connecting to the database through java
** A log name logfile is created
Recall that now the database is running in the background and the DBMS will still operate even if you logout.
YOU SHOULD ALWAYS SHUTDOWN THE DBMS BEFORE LOGGING OUT
5) Shutdown the database
You should stop the database politely by issuing the following command.
$ pg_ctl stop
You should see the following message:
waiting for postmaster to shut down......done
postmaster successfully shut down
6) Create the mydb Database
In order to proceed to this phase you should start the database if it is not started already.
First issue the following command which creates the volcano database:
$ createdb mydb
In order to drop the whole database
$ dropdb mydb
(ATTENTION EVERYTHING WILL BE LOST - THE DATABASE, TABLES, DATA, etc)
The below figure shows what we have created so far
7) Using the psql Client Terminal
Now we will create launch the psql terminal from where we can start creating the tables, performing sql commands and other things.
Below is the command and the expected result:
$ psql mydb
Welcome to psql, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? for help on internal slash commands
\g or terminate with semicolon to execute query
\q to quit
mydb=#
Be aware that any input is only parsed by the psql parser once the ";" character is found. Of course the below tokens are not recognized by the parser
mydb=# HI
mydb-# THERE
mydb-# THIS
mydb-# STOPS WHEN THE ';' is found
mydb-# ;
ERROR: parser: parse error at or near "HI" at character 1
mydb=#
The below picture displays graphically the components discussed so far:
8) Declaring Tables - DDL (Data Definition Language)
While you are in the psql environment you can create the tables in 2 ways.
1) Manually - You can either start writing the definition directly while you are in the environment e.g.
mydb=# CREATE TABLE students (
mydb(# sid integer,
mydb(# sname text,
mydb(# sex text,
mydb(# age integer,
mydb(# year integer,
mydb(# gpa numeric
mydb(# );
CREATE TABLE
mydb=#
The disadvantage of this approach is that you have to manually input all definitions every time you want to create a table, run queries etc.
If the database is corrupted then you will loose all your work!
2) From File
You should store each table definition (both create command and drop command) in separate text files. For example:
~/Classes/cs166/test/tables/create.sql
~/Classes/cs166/test/tables/drop.sql
~/Classes/cs166/test/tables/load.sql
Here are sample files for create, load and drop.
After that you simply copy/paste from the above files into the psql console or even better run in linux shell:
$ psql -f ~/Classes/cs166/test/tables/create.sql mydb
This will create ALL tables sequentially!
For further info regarding Postgres DDL consult: http://developer.postgresql.org/docs/postgres/ddl.html
9) Manipulating Data - DML (Data Manipulation Language)
Again, while you are in the psql environment you can manipulate data by INSERTing, UPDATEing DELETEing records.
Suppose that the products table from the previous section is already created. Now we can add some data into the table. After that we issue a select statement to see what happened.
mydb=# INSERT INTO Students VALUES (1,'Jacobs, T.','m',29,5,3.60);
mydb=# SELECT * FROM Students;
sid | sname | sex | age | year | gpa
----+------------+-----+-----+------+-----
1 | Jacobs, T. | m | 29 | 5 | 3.60
(1 row)
mydb=# UPDATE Students SET year = 4 WHERE age = 29;
ydb=# SELECT * FROM Students;
sid | sname | sex | age | year | gpa
----+------------+-----+-----+------+-----
1 | Jacobs, T. | m | 29 | 4 | 3.60
(1 row)
mydb=# DELETE FROM Students WHERE year = 4;
mydb=# SELECT * FROM Students;
sid | sname | sex | age | year | gpa
----+------------+-----+-----+------+-----
(0 rows)
For further info regarding Postgres DML consult : http://developer.postgresql.org/docs/postgres/dml.html
10) Bulk Loading Data
Now you may realize that if somebody gives you 1000 records you won't really write 1000 INSERT statements.
Like most database systems, Postgres comes with a bulk loading function.
For demonstration suppose that you have created the following text file
/data/student.txt
29
23
24
25
24
\.
* Attention to the end file character \.
The COPY function allows you to bulk-load these entries into the system.
mydb=# COPY students(age) FROM '/data/student.txt';
mydb=# SELECT * FROM Students;
sid | sname | sex | age | year | gpa
----+------------+-----+-----+------+-----
| | | 29 | |
| | | 23 | |
| | | 24 | |
| | | 25 | |
| | | 24 | |
(5 rows)
You can see that we explicitly identified what are these values (i.e. age). Since there is no key constrain all values for the other fields can contain a null value!
Now lets see how we can bulk load data that are tab-separated :
/data/student.txt
1 "Jacobs, T." "m" 29 5 3.60
2 "Pierson, E." "m" 32 5 3.50
3 "Zeene, Ben N." "m" 21 5 3.90
\.
Loading tab-separated Values
mydb=# COPY students(sid, sname, sex, age, year, gpa)
mydb-# FROM '/data/student.txt'
mydb-# WITH DELIMITER '\t';
mydb=# SELECT * FROM Students;
sid | sname | sex | age | year | gpa
----+----------------+-----+-----+------+-----
1 | "Jacobs, T." | "m" | 29 | 5 | 3.60
2 | "Pierson, E." | "m" | 32 | 5 | 3.50
3 | "Zeene, Ben N." | "m" | 21 | 5 | 3.90
(3 rows)
You can get some bulk data from here
11) Using the right data types
In this test practice you may use any supported data type you find more appropriate. For a full list of data types supported by Postgres you should consult http://developer.postgresql.org/docs/postgres/datatype.html .
For example for a street address you may use the datatype TEXT or datatype CHAR(300) because this attribute has not a well-known size (so you can either restrict it or not). Other fields such as SSN, Phone Numbers or others should use the appropriate length i.e. (9 or 11). In this practice you are expected to use the right field sizes for well-known attributes.
Below we will discuss some important data types which might be unknown to you from your programming courses.
a) Serial
If you need to insert some records in the database but don't want to worry about the record's IDs you should take advantage of the SERIAL data type. This data type is found in most database systems with different name (such as auto-increment or identity)
CREATE TABLE example (
id serial,
descr TEXT NOT NULL,
PRIMARY KEY (id));
In the example below you can see that the ID field is automatically assigned:
mydb=# INSERT INTO example (descr) VALUES ('some data...');
mydb=# INSERT INTO example (descr) VALUES ('some data...');
mydb=# INSERT INTO example (descr) VALUES ('some data...');
mydb=# SELECT * FROM EXAMPLE;
id | descr
----+--------------
1 | some data...
2 | some data...
3 | some data...
(3 rows)
b) Timestamps
Timestamps are a concatenation of a date and a time (ignore anything regarding time-zones). They are convenient for tracking events. In the example you can see their power when they are used with the DEFAULT keyword. The result is that every time you INSERT a record in this table it will automatically obtain the time at which the record was entered (in the "enter" field).
CREATE TABLE timef (
timefid serial NOT NULL,
entered TIMESTAMP DEFAULT now();
descr TEXT NOT NULL,
PRIMARY KEY (timefid));
mydb=# INSERT INTO timef (descr) VALUES ('This island is cool');
INSERT 17069 1
mydb=# SELECT * FROM timef;
timefid | entered | descr
---------+----------------------------+--------------------
1 | 2003-02-01 18:53:59.154433 | This island is cool
(1 row)
mydb=# INSERT INTO timef (descr) VALUES ('This island is cool');
INSERT 17070 1
mydb=# SELECT * FROM timef;
timefid | entered | descr
---------+----------------------------+--------------------
1 | 2003-02-01 18:53:59.154433 | This island is cool
2 | 2003-02-01 18:54:15.566001 | This island is cool
(2 rows)
For further info regarding Postgres Data types, consult: http://developer.postgresql.org/docs/postgres/datatype.html