Finding Database Network Abusers with Netstat

DBAs are often faced with the encompassing task of trying to reduce the use of system resources. In doing so, DBAs have the ominous task of having to relate the consumption of system resources by a database to the available resources on the database server it runs on. Sometimes this is a very difficult task.

Take for instance the use of network resources. When we look at database statistics accumulating from within the database we get a glimpse of what is actually happening with network traffic for the database as a whole or for individual sessions. What is often looked at are the SQL*Net message from/to series of statistics. Take for example two of the most common of these types of statistics:

Common Network Database Statistics

SQL*Net message from client – This is the time it took for a message to arrive from the client since the last message was sent to the client.

SQL*Net message to client – This is the time it took for the sending of a message to the client.

While these give an indication of how long it took for messages to send, these statistics really don’t give a good indication of how busy the network is for database activity. For this type of information we must begin our investigation with statistics that are measured at the operating system / network level. From this level we can then relate network resource consumption, possibly abusers of the network, to database sessions that are executing within the database. Take the following example:

Like any database monitoring endeavor, trapping the use of a resource is the first trick in determining who is using the resource and who might be abusing it. For our purposes, the UNIX utility netstat (network statistics) is a sufficient tool to trap network users. The netstat utility is a command line utility that displays both incoming and outgoing network traffic and is often the utility of choice for finding problems within the network and determining the amount of traffic within the network that might be impacting performance. Like any utility, when issuing the command there are many options, netstat is no different. I’d encourage you to investigate the various options. For purposes of this example, we will focus on the following parameters:

Netstat Parameters

-t : Displays only TCP connections

-p : Shows which processes are using which sockets

Netstat Statistics Returned from -t & -p Parameter Options

Proto : The protocol (tcp, udp, raw) used by the socket, for our example this will be tcp

Recv-Q : Receiving queue, the count of bytes not copied by the user program connected to this socket

Send-Q : Sending queue, the count of bytes not acknowledged by the remote host

Local Address : Address and port number of the local end of the socket, this is the address/connection on the database server side

Foreign Address : Address and port number of the remote end of the socket, this is the address of the issuing client

State : State of the socket

PID/Program name : PID (process ID) and program name, very important for linking PID on the operating system side to the session/process within the database

To trap output from those UNIX commands that do not have a refresh parameter, I like to put together a little script that will do something similar. For netstat I use the following, which accepts three parameters: parameters for the sleep time between samples, the number of rows to return for display, and the network queue you are interested in. When executed, this script behaves a lot like the top command; continually refreshing the screen for the number of top sessions you’ve selected.

#!/bin/sh
# -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
# monitorNETSTAT.sh
# Use netstat to display tcp network usage by process
# -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
if [ $# != 3 ]
then
echo "Usage monitorNETSTAT.sh sleep topN sortQ"
echo "Where sleep - seconds to sleep between samples"
echo " topN - top number of rows to return"
echo " sortQ - [r|s] to sort on (r)ecv-q or (s)end-q"
exit 1
fi
SLEEP=$1
TOPN=$2
if [ "${3}" = "r" ]
then
SORTQ=2
else
SORTQ=3
fi
while [ 1 ]
do
netstat -t -p | grep Recv-Q > netstatHEAD.lst
netstat -t -p | grep tcp | sort -k ${SORTQ} -g -r > netstatDETAIL.lst
clear
cat netstatHEAD.lst
head -n ${TOPN} netstatDETAIL.lst
echo " -- -- -- -- -- -- -"
echo "Status Counts"
echo " -- -- -- -- -- -- -"
cat netstatDETAIL.lst | grep tcp | cut -c 77-88 | sort -u |
while read netstatSTATUS
do
statusCnt=`cat netstatDETAIL.lst | grep $netstatSTATUS | wc -l`
echo $netstatSTATUS $statusCnt
done
sleep ${SLEEP}
done

But before executing this script, let’s first log into a database and gather some static information on the session we’d like to capture for this example. To log into the database I use the following connect string:

C:Usersjkoopmann>sqlplus auser/apassword@//192.168.35.112:1521/ORADB
SQL*Plus: Release 11.2.0.2.0 Production on Fri Apr 6 18:27:16 2012
Copyright (c) 1982, 2010, Oracle. All rights reserved.
Connected to:
Oracle Database 10g Release 10.2.0.4.0 - 64bit Production

Now, I can execute the following SQL to show my current session information. Obviously you won’t know the session / process that is causing network usage problems but this gives us a nice starting point to just take a look at what we should expect.

SQL> set heading off
SQL> select 'Sid, Serial#, Aud sid : '|| s.sid||' , '||s.serial#||' , '||s.audsid a,
'DB User / OS User : '||s.username||' / '||s.osuser a,
'OS Process Ids : '||s.process||' (Client) '||p.spid||' (Server)' a,
'Client Program Name : '||s.program a
from v$process p,
v$session s
where p.addr = s.paddr
and s.audsid = userenv('SESSIONID');
Sid, Serial#, Aud sid : 1897 , 16112 , 206249940
DB User / OS User : AUSER / jkoopmann
OS Process Ids : 6576:3416 (Client) 4107 (Server)
Client Program Name : sqlplus.exe

Key points to notice from the above sequence is that I’ve connected to a database as user AUSER, the IP address of the database, the port number, the service name, and the OS process ID (4107).

It is equally important to see what the IP address is on my local client, as this will also be reported within the netstat output. For this I just issue the ipconfig command and not that my local IP:

C:Usersjkoopmann>ipconfig /all
Ethernet adapter Local Area Connection 2:
….
Connection-specific DNS Suffix . : somewhere.net
Description . . . . . . . . . . . : Cisco Systems VPN Adapter for 64-bit Windows
Physical Address. . . . . . . . . : xx-xx-xx-xx-xx-xx
DHCP Enabled. . . . . . . . . . . : No
Autoconfiguration Enabled . . . . : Yes
Link-local IPv6 Address . . . . . : xxxx::xxxx:xxxx:xxxx:xxxx%xx (Preferred)
IPv4 Address. . . . . . . . . . . : 192.168.5.123(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.255.0

To generate some network traffic, just issue some form of large Cartesian join:

SQL> SELECT a.*, b.*, c.*, d.* FROM dba_tables a, dba_tables b, dba_tables c, dba_tables d;

Run the monitorNETSTAT.sh script:

C:Usersjkoopmann>
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 13969 192.168.35.112:1521 192.168.5.123:58097 ESTABLISHED 4107/oracleORADB
tcp 0 360 ::ffff:192.168.35.112:22 ::ffff:192.168.5.123:53822 ESTABLISHED -
tcp 0 17 192.168.35.112:1521 192.168.35.234:3083 ESTABLISHED 15296/oracleORADB

Notice that the Send-Q for PID 407 is backlogged for my connection to the ORADB.

Key points to notice are that you can see the IP of the database server, the IP address of my client machine, and most importantly the PID (process ID) for the connection. This PID can now be taken and used with any of your favorite SQL scripts that use V$PROCESS.SPID and monitor session activity within the database.

By using netstat, you can easily see if there is a network bottleneck. Recv-Q and Send-Q for all intents and purposes should nearly always be zero unless there is a network issue. Under normal circumstances Recv-Q and Send-Q may have some value but they should not pile up and remain non-zero for long. If these queues remain non-zero it is a simple enough task to find the process ID within the database and determine the abusive SQL. Now, the next time someone asks you if the database is consuming network resources or is having network usage issues, you can easily turn to netstat for your answer.

See all articles by James Koopmann

James Koopmann
James Koopmann
James Koopmann has fourteen years of database design, development and performance tuning experience. In addition, he has extensive database administration experience in Oracle and other relational databases in production environments, specializing in performance tuning of database engines and SQL based applications. Koopmann is an accomplished author with several technical papers in various Oracle related publications such as Oracle Magazine, Oracle Professional and SQL>UPDATE_RMOUG. He is a featured author and database expert for DatabaseJournal, a member of the editorial review committee for Select Journal (The Magazine for the International Oracle Users Group), an Oracle Certified Professional DBA and noted speaker at local Oracle User Groups around the country.

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles