Showing posts with label Networking. Show all posts

Tuesday, May 20, 2014

Deleting locked IPSec SAs from Fortigates

We have had a borked IPSec Phase1 definition in our configuration since the initial configuration. The delete option was grayed out for it, despite the ref count showing 0. I finally had to call Fortinet about it. The engineer I spoke with said that the ref count of 0 doesn't necessarily mean that there aren't any references (what good is the ref count then?). He grabbed a copy of the configuration, and searched for the name of the Phase1. Sure enough, a policy routing entry turned up that we had long forgotten about. After removing this, I was able to delete the Phase1.
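The search the engineer ran can be approximated against a downloaded configuration backup. This is a minimal sketch, assuming the backup is a plain-text file in which object names appear in double quotes; the helper name `find_refs` and the file and object names in the usage line are hypothetical.

```shell
#!/bin/sh
# find_refs FILE NAME: print every line of FILE (with its line number)
# that mentions NAME as a double-quoted object name.
# Hypothetical helper; FILE is assumed to be a plain-text config backup.
find_refs() {
    grep -n -F -- "\"$2\"" "$1"
}
```

For example, `find_refs backup.conf broken-p1` would list each place the Phase1 name is still used, regardless of what the ref count says.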

Monday, May 19, 2014

Advertising arbitrary routes via OSPF on Fortigate

To be clear, I'm not sure this is the correct way to inject routes into OSPF. That being said, it yields the desired behavior. The situation goes like this...

I have addressed all of the interfaces of my Fortigate (FGT) with subnets of network 65.65.65.0/24. Additionally, I have some virtual IPs (VIPs) defined that map addresses from 65.65.64.0/24 to corresponding addresses in 65.65.65.0/24. For example:

65.65.64.1 -> 65.65.65.1

This is not a garden-variety configuration, mapping one public subnet to another. The reasons are complex, involving BGP, portable subnets, and multiple data centers. The bottom line: I need the FGT to NAT this traffic.

My initial solution to this problem was static routes. However, these become difficult to maintain as the network grows in complexity (we're definitely into that territory). What I want to do is advertise a subnet of 65.65.64.0/24 via OSPF. On the Fortigate, it's not as easy as saying "inject this subnet into OSPF." My solution: create a loopback interface on the FGT, and redistribute the connected subnet into OSPF.
  1. Create a Loopback network interface, with an address in the subnet you want to advertise. It doesn't seem to make a difference what address you use.
        config system interface
            edit "port-loopback"
                set vdom "root"
                set ip 65.65.64.1 255.255.255.192
                set type loopback
                set description "Loopback interface used to provide route for portable addresses."
                set snmp-index 28
            next
        end

  2. Create a prefix-list entry that identifies the loopback subnet.
        config router prefix-list
            edit "connected-to-ospf-v4"
                set comments "Define connected routes to export to OSPF"
                    config rule
                        edit 10
                            set prefix 65.65.64.0 255.255.255.192
                            unset ge
                            unset le
                        next
                    end
            next
        end

  3. Create a route-map that uses the prefix list.
        config router route-map
            edit "rm-connected-to-ospf"
                set comments "Defines IPv4 connected routes to redistribute to OSPF"
                    config rule
                        edit 10
                            set match-ip-address "connected-to-ospf-v4"
                        next
                    end
            next
        end

  4. Configure OSPF to redistribute connected networks.
        config router ospf
            config redistribute "connected"
                set status enable
                set routemap "rm-connected-to-ospf"
            end
        end
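The loopback address only needs to fall inside the prefix that the prefix-list matches. A small sketch of that check in plain sh follows; the function names `ip_to_int` and `in_prefix` are made up for illustration, and the arithmetic assumes a shell with at least 64-bit integers.

```shell
#!/bin/sh
# ip_to_int A.B.C.D: print the dotted-quad address as a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# in_prefix ADDR NETWORK MASK: succeed if ADDR falls inside NETWORK/MASK.
in_prefix() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}
```

With the values from the steps above, `in_prefix 65.65.64.1 65.65.64.0 255.255.255.192` succeeds, while an address such as 65.65.64.65 lands in the next /26 and would not be matched by the prefix-list.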

Tuesday, May 6, 2014

Fortigate VIPs ate my packets.

We traded our Cisco ASAs for Fortinet Fortigates (FGT). So far, the trade-off seems to be a pretty, useable interface, for the rock solid (albeit annoying) functionality of the Cisco. One major issue (for us) that came up during our rollout was related to Virtual IPs (VIPs), essentially Fortinet parlance for destination NAT.

We have a very odd NAT situation. For a particular service we offer, we have clients that are incapable of connecting to the listening port (more accurately, the red tape involved in changing a port number in their script would require hundreds of hours of meetings, and many thousands of developer hours).

As a result, we have supported these clients by using a port redirection, but only for certain source addresses, because the port they MUST connect to is in use by another application (confused yet? I am). On the Fortigate, we solved this by creating a pair of VIPs: one broad, for all ports; the other specific to the goofy awfulness. Here is the example of how we made it work.

edit "srv-v4-redir"
    set src-filter "1.2.3.4" "5.6.7.8"
    set extip 111.222.0.1
    set extintf "any"
    set portforward enable
    set mappedip 10.0.0.1
    set extport 77
    set mappedport 10077
next
edit "srv-v4"
    set extip 111.222.0.1
    set extintf "any"
    set mappedip 10.0.0.1
next

As horrible as it looks, it actually works. The result is that clients 1.2.3.4 and 5.6.7.8 connect to port 77, but they actually get DNAT to port 10077. Anyone else connecting to port 77 goes to port 77.
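The outcome can be summarized as a tiny decision table. The sketch below only models the result described above (it is not how FortiOS evaluates VIPs internally), and the function name `mapped_port` is hypothetical.

```shell
#!/bin/sh
# mapped_port SRC DPORT: print the internal port that a connection to the
# shared external IP ends up on, per the behaviour described above.
mapped_port() {
    case "$1:$2" in
        1.2.3.4:77|5.6.7.8:77) echo 10077 ;;  # srv-v4-redir: filtered sources on port 77
        *)                     echo "$2" ;;   # srv-v4: everyone else, port unchanged
    esac
}
```

So `mapped_port 1.2.3.4 77` prints 10077, while any other source, or any other port from the filtered sources, keeps its original destination port.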

It is worth noting that our original configuration was more awful, and broken. When I originally configured this bit of NAT, I was still learning my way around the FGT. I mistakenly configured the "srv-v4-redir" VIP with an extintf of "vlan7", our outside interface. This broke other services using the broader "srv-v4" VIP, but in amazingly random ways. All traffic from the outside worked fine. However, we discovered breakage for some users who connect to another service on that VIP from "vlan4"… but only for users coming from some source networks (networks unrelated to the "redir" sources).

In these cases, the traffic would just vanish into the FGT, as confirmed by the sniffer, and flow traces. In the latter case, traffic would fail with the following cryptic trace messages.

fortinet-1a # id=12 trace_id=26 msg="vd-root received a packet(proto=6, 172.25.1.7:52606->111.222.0.1:443) from vlan4."
id=12 trace_id=26 msg="allocate a new session-00e3addb"
id=12 trace_id=26 msg="find SNAT: IP-10.0.0.1(from IPPOOL), port-0"
id=12 trace_id=26 msg="use addr/intf hash, len=13"
id=12 trace_id=26 msg="pre_route_auth check fail(id=0), drop"

After escalating the ticket several times with Fortinet, and two weeks of broken connections (I'm calling you out here Fortinet, two weeks for an answer is unacceptable), we finally got assigned to a foul-mouthed engineer (the best kind). He identified the extintf problem, in between bouts of telling me what a kludgy setup this is (yes, I know; I don't like it either), and offered a number of possible solutions, with the caveat of "I can't guarantee it, because nobody does this." We tested the change, which annoyingly required us to remove all references to the "redir" VIP, and it worked. Life goes on, and I'm pretty satisfied with the FGT.

Thursday, March 13, 2014

Understanding resolvconf behavior on pxe-booted hosts.

I have been learning more than I ever cared to know about resolvconf, and what happens when you use it on a host with a read-only root filesystem. I have been preparing to roll out pxe-boot virtual machines at a second location. The PXE image has the following characteristics.

* NFS-mounted, read-only / filesystem.
* Local writeable disk for swap and /var.
* Unionfs, md-backed /etc (non-persistent, r/w)

I noticed that on the initial boot of a new VM, /etc/resolv.conf would be written correctly. On all subsequent boots, however, the NFS-supplied resolv.conf was never updated. After a frustrating afternoon of digging, I determined why resolvconf appears to stop working.

Resolvconf stores state data in /var/run/resolvconf. When dhclient is run for an interface, the dhclient-script script calls resolvconf with the DNS particulars. Resolvconf then looks in the interfaces/ sub-directory for an entry named after the interface. If the file does not exist, or does not match the domain/nameserver options received by resolvconf, a new file is written, and appropriate changes are made to /etc/resolv.conf. If the options match what resolvconf already has, no changes are made. The output below shows the contents of the interfaces/ directory on my pxe host.

> ll /var/run/resolvconf/interfaces/
total 8
-rw-r--r--  1 root  wheel  80 Mar 13 18:00 vmx0:dhcp4
-rw-r--r--  1 root  wheel  76 Mar 13 18:00 vmx1

The problem with my pxe hosts lies in the volatile /etc. Every time the host reboots, the modified contents of /etc vanish. /etc/resolv.conf is replaced with the copy from NFS. In my case, this copy reflects the nameservers at the "original" datacenter. Since the state directory for resolvconf exists on the persistent /var, resolvconf sees the old [unchanged] lease data, and assumes everything is peachy with the resolv.conf file.
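The failure mode is easier to see in a stripped-down model. The sketch below is not the real resolvconf; it only imitates the compare-then-write behaviour described above. `STATE_DIR` stands in for /var/run/resolvconf/interfaces, `RESOLV` for /etc/resolv.conf, and the function name `update_dns` is hypothetical.

```shell
#!/bin/sh
# Simplified model of the behaviour described above; not the real resolvconf.
# STATE_DIR plays the role of the persistent state directory on /var,
# RESOLV the role of /etc/resolv.conf, which is reset from NFS on every boot.

# update_dns IFACE DATA: rewrite $RESOLV only when DATA differs from the
# state already recorded for IFACE.
update_dns() {
    state_file="$STATE_DIR/$1"
    if [ -f "$state_file" ] && [ "$(cat "$state_file")" = "$2" ]; then
        return 0    # state matches, so resolv.conf is assumed to be current
    fi
    printf '%s\n' "$2" > "$state_file"
    printf '%s\n' "$2" > "$RESOLV"
}
```

If the file behind `RESOLV` is replaced between two calls while the state file survives, the second call with the same data returns early and the stale copy stays in place, which matches what the pxe hosts do on every boot after the first.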

I don't need the extra features of resolvconf, so I can solve the problem by disabling it. I created a file in the pxe image, /etc/dhclient-enter-hooks, that contains the following.

resolvconf_enable=NO

My initial, more complicated fix, was to create an rc script to re-initialize the resolvconf state directory on every boot. This also worked flawlessly.

#!/bin/sh
#
# Clean out the contents of the resolvconf state directory. Otherwise,
# /etc/resolv.conf never gets updated after the initial boot of a new pxe host.
#
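# rcorder(8) understands PROVIDE, REQUIRE, BEFORE and KEYWORD; there is no
# AFTER keyword, so the filesystem dependency is expressed with REQUIRE.
# BEFORE: netif
# REQUIRE: FILESYSTEMS
# PROVIDE: clean_resolvconf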

echo -n "Cleaning out resolvconf state directory: "

/sbin/resolvconf -I

# Test the command's exit status directly; `[ $? ]` is always true.
if /sbin/resolvconf -I; then
    echo "OK"
else
    echo "FAILED"
fi

Wednesday, February 5, 2014

dhclient exits chroot on FreeBSD 10.0

I've booted my first pxe FreeBSD 10.0 image, and discovered that dhclient, of all things, doesn't seem to work. Running dhclient from the command-line after boot results in the following output:

WARNING dhclient failed to start
chroot
exiting

Some searching turns up a thread on the FreeBSD forums. A missing /var/empty directory is to blame. It should be owned by root, with permissions of 755.
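For the PXE image, the fix boils down to creating the directory with the right mode. A minimal sketch, with a hypothetical helper name `ensure_empty_dir`; ownership is left to the caller because chown needs root.

```shell
#!/bin/sh
# ensure_empty_dir DIR: create DIR if it is missing and give it mode 755.
# Ownership is not touched here; see the usage note.
ensure_empty_dir() {
    mkdir -p "$1" && chmod 755 "$1"
}
```

Run as root against the image, this would be something like `ensure_empty_dir /var/empty && chown root /var/empty`.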

Tuesday, November 12, 2013

Using CARP with VMware ESXi

If you want to use CARP on your VMware guest VMs, you will probably find that it doesn't work out of the box. This is because ESXi rejects promiscuous mode on the virtual switch by default. To enable promiscuous mode, go to the Network configuration section for the host (in vSphere Client), and click properties for the vSwitch. Edit the properties for the vSwitch, and change the setting of "Promiscuous Mode" to Accept under the "Security" tab.

For bonus points, if you are using NIC Teaming on ESXi (even with just a standby adapter), you will find that your CARP interfaces always remain in BACKUP state, and your logs fill with the following messages.

Nov 12 11:25:51  kernel: carp0: MASTER -> BACKUP (more frequent advertisement received)
Nov 12 11:25:51  kernel: carp0: link state changed to DOWN
Nov 12 11:25:54  kernel: carp0: link state changed to UP

This is because ESXi is rebroadcasting CARP advertisements that come back down the other members of the team. To correct this, you need to dig into the Advanced Settings, under Software. Change Net.ReversePathFwdCheckPromisc to 1. Annoyingly, you will need to reboot the host for the change to take effect, but it works.

Wednesday, October 9, 2013

Using BIRD to route over OpenVPN tunnels.

OpenVPN tunnels, good. BIRD routing daemon, great. OSPF on OpenVPN tunnels, headache. The combination of OpenVPN and BIRD routing daemon for OSPF is nothing new for me. I've been using it pretty happily for over a year. However, it always seemed as though something was a bit precarious. As I have started to scale the implementation, I've realized that I needed a rethink of my strategy.

The original design used a /31 subnet on each tunnel, with each endpoint using one of the two addresses. Logically, it made perfect sense. The catch is that OSPF does not automatically advertise the IP addresses of the tunnels (a quirk of OpenVPN, BIRD, or both). Traffic would flow through the routers, but the remote tunnel address would be unreachable. I solved this by adding a stubnet x.x.x.x/31 directive to the OSPF configuration at one of the endpoints. At the time this worked fine.

As I brought a third site online, with multiple VPN tunnels between each site, the original design quickly broke down. The /31s were no longer consistently routable. I remedied this by changing the /31 stubnet to a /32, with each router advertising its own IP addresses. Life was briefly good, until I realized that restarting the OpenVPN tunnels would fail. OpenVPN would log the error "router-a openvpn_1199[18976]: FreeBSD ifconfig failed: external program exited with error status: 1". The following message was also logged: "ifconfig: ioctl (SIOCAIFADDR): File exists". From the depths of my memory, I recalled that this error occurs when you attempt to configure an interface with an address that already exists in the routing table. Since BIRD was already advertising the /32 host IP, trying to address the tunnel would fail.

Some creative restarting of OpenVPN and BIRD resolved the problem, but is an unacceptable solution for production. I turned to the BIRD mailing list with my situation, and very quickly received a response from one of the BIRD developers. He suggested a number of ideas, the most promising of which is to dedicate a subnet to each router, as a pool from which to draw addresses for local tunnel endpoints on that specific host. The subnet is then configured as a stubnet by BIRD. The ends of a given tunnel may not be remotely close to each other, but this is fine because a tunnel is a point-to-point link. Tunnels can be restarted at will, without contention from existing routes in the table.

The diagram below shows an example topology. Notice that the endpoint of each tunnel is in a completely different network. For example, on the tunnel between router-a and router-b, router-a uses an IP of 192.168.0.0, and router-b uses 10.0.0.0. It doesn't matter, it works just fine. With the given stubnet configured in the bird.conf for each router, all tunnel endpoints are reachable. The network and broadcast addresses for each router's stubnet are also useable, as shown in the example.


Sunday, September 22, 2013

Fixing SSH timeouts on the ASA

I spent a bunch of time getting my head around class maps, policy maps, and service policies, in an effort to correct the issue of idle SSH connections being timed out after an hour (the default idle timeout for TCP connections on the ASA at version 9.1). The documentation is a confusing web of headache, but I found this blog post to be useful reading. My solution was to leave the timeout unchanged, but to enable Dead Connection Detection for SSH connections. Essentially, when an idle SSH connection hits the threshold, the ASA forges a packet to the endpoints to verify that the socket is still open. If there is a response, the idle timer is reset.


access-list ssh_ports remark access list to id ssh traffic for the ssh_ports class map
access-list ssh_ports extended permit tcp any any eq ssh 
access-list ssh_ports extended permit tcp any any eq 2222
class-map ssh_traffic
 description identify SSH traffic, so we can apply policy
 match access-list ssh_ports
policy-map generic_interface_policy
 class ssh_traffic
  set connection timeout dcd 
service-policy generic_interface_policy interface outside
service-policy generic_interface_policy interface inside

Tuesday, September 17, 2013

Cisco ASA Remote Access configuration for Mac OS X

I spent the day fighting to get a remote access IPSec connection set up as follows:

  • ASA 5515-X, running version 9.1.
  • ASA network interfaces are already configured.
  • IPSec clients are assigned addresses from the range 123.0.0.199-201.
  • Client is running OS X 10.8.4 Mountain Lion.
  • Client is using the built-in OS X IPSec client.
  • Client IP is private, behind NAT, with a DHCP-assigned WAN IP.
  • After connecting, client should be able to reach the internal networks 123.0.0.128/26, 123.0.0.192/27.
  • All other traffic is not sent across the VPN.
The following configuration should be added to the ASA:

ip local pool REMOTE_ACCESS_POOL 123.0.0.199-123.0.0.201
management-access inside
access-list REMOTE_ACCESS_SPLIT_TUNNEL remark The corporate network behind the ASA.
access-list REMOTE_ACCESS_SPLIT_TUNNEL standard permit 123.0.0.128 255.255.255.192 
access-list REMOTE_ACCESS_SPLIT_TUNNEL standard permit 123.0.0.192 255.255.255.224 
crypto ipsec ikev1 transform-set REMOTE_ACCESS_TS esp-aes-256 esp-sha-hmac 
crypto dynamic-map REMOTE_ACCESS_DYNMAP 1 set ikev1 transform-set REMOTE_ACCESS_TS
crypto map REMOTE_ACCESS_MAP 1 ipsec-isakmp dynamic REMOTE_ACCESS_DYNMAP
crypto map REMOTE_ACCESS_MAP interface outside
crypto ikev1 enable outside
crypto ikev1 policy 1
 authentication pre-share
 encryption aes-256
 hash sha
 group 2
 lifetime 7200
group-policy REMOTE_ACCESS_GP internal
group-policy REMOTE_ACCESS_GP attributes
 split-tunnel-policy tunnelspecified
 split-tunnel-network-list value REMOTE_ACCESS_SPLIT_TUNNEL
username hunter password **** encrypted
tunnel-group REMOTE_ACCESS_TUNNELGRP type remote-access
tunnel-group REMOTE_ACCESS_TUNNELGRP general-attributes
 address-pool REMOTE_ACCESS_POOL
 default-group-policy REMOTE_ACCESS_GP
tunnel-group REMOTE_ACCESS_TUNNELGRP ipsec-attributes
 ikev1 pre-shared-key *****

For explanation of what all this does, I recommend reading the following Cisco docs. It is worth noting that this configuration does not work with Windows 7/8, which use IKEv2 instead of v1.

The configuration for the built-in OS X IPSec client is described in the following doc. One gotcha I ran into (which is clearly stated in the document) is that the tunnel-group name must be specified in the 'Group Name' field on the Mac. In the case of the above configuration, the group name is REMOTE_ACCESS_TUNNELGRP.

Saturday, September 7, 2013

Using FreeBSD loopback interfaces with BIRD

Why go for the simple solution, when you can first spend hours tearing out your hair in frustration?

I've been working on a new data center deployment, and getting my fingers back into the networking realm; a welcome change. This includes my first OSPFv3 deployment, and we're using BIRD. For the most part, I have been very happy with BIRD for OSPF and BGP; though I have run into some quirks.

The quirk on my mind at this moment is with loopback interfaces. The Cisco way of doing things seems to be to run iBGP sessions between loopback addresses. The rationale is that if you use an interface address, and that interface goes down, your iBGP session goes with it, as that address becomes unreachable. So you use a loopback interface, which is always up. The addresses on the loopback are advertised via an IGP, facilitating the iBGP connection.

For better or worse, I decided to follow the herd, and go with a loopback interface. For IPv4, this was pretty straightforward. Configure the loopback in the OS, add it to bird.conf as a stub interface, and good to go. For reference, here are the bits to do so on FreeBSD.

# /etc/rc.conf
cloned_interfaces="lo1"
ifconfig_lo1="inet W.X.Y.Z/32"

# /usr/local/etc/bird.conf
protocol ospf {
    tick 2;
    area 0 {
        stub no;
        interface "vlan7", "vlan500" {
            cost 5;
            hello 2;
            dead 10;
            authentication cryptographic;
            password "password";
        };
        interface "lo1", "vlan1001" { stub; };
    };
}

And since the proof is in the pudding (or output)...

bird> show route for W.X.Y.Z
W.X.Y.Z/32 via A.B.C.D on vlan500 [ospf1 09:08] * I (150/5) [W.X.Y.Z]

When I went to configure OSPF for our IPv6 allocation, things didn't go quite so smoothly. I used the following similar configuration for the v6 BIRD configuration.

# /etc/rc.conf
ifconfig_lo1_ipv6="inet6 2620:W:X:Y::Z/128"

# /usr/local/etc/bird6.conf
protocol ospf ospf_v6 {
    tick 2;
    area 0 {
        stub no;
        interface "vlan7", "vlan500" {
            cost 5;
            hello 2;
            dead 10;
            # Authentication is not supported by OSPFv3, supposed to be IPSec AH authenticated.
            #authentication cryptographic;
            #password "password";
        };
        interface "lo1", "vlan1001" { stub; };
    };
}

With this configuration in place, I realized that my IPv6 loopback address was not being advertised. Examining the logs, BIRD was quite happy to tell me that it had filtered out that route. WTF? After a bunch of time wasted searching Google and throwing shit at the wall, my loopback address was still not working. I finally stumbled on this mailing list thread, where I learned that using a loopback is NOT an expected configuration; at least in the eyes of the developers. Furthermore, the fact that it is working in IPv4 was surprising, and perhaps a bug. The reason that BIRD is denying my lo1 IP is that there is no link-local IP on the interface as well. Without starting a discussion on whether Cisco or BIRD is more right, I'll just say I was *((# *grumble* *f'n BIRD*.

I jumped through a few more hoops, and finally discovered that I could use a tap interface in lieu of a loopback. It would generate a link-local address, OSPF would advertise it, and iBGP happiness filled the kingdom. Never mind that it is about as hacky as you can get. For what it's worth, here is the rc.conf goo to make it happen.

# /etc/rc.conf
# DON'T USE THIS! IT'S HACKY AND EVERYONE WILL LAUGH AT YOU.
cloned_interfaces="tap0"
ifconfig_tap0_ipv6="inet6 2620:W:X:Y::Z/128 -ifdisabled"

I slept on it. When I woke up, I had some OSPF fixes on my mind for the ASAs (that's another raar story). I was poking around a little more when I read and was reminded about the BIRD stubnet configuration directive. In a nutshell, BIRD will always advertise a stubnet route...perfect! Changed the configuration to support this, and life is good again.

# /etc/rc.conf
ifconfig_lo1_ipv6="inet6 2620:W:X:Y::Z/128"

# /usr/local/etc/bird6.conf
protocol ospf ospf_v6 {
    tick 2;
    area 0 {
        stub no;
        interface "vlan7", "vlan500" {
            cost 5;
            hello 2;
            dead 10;
        };
        interface "vlan1001" { stub; };
        stubnet 2620:W:X:Y::Z/128;
    };
}

and the proof!

bird> show route for 2620:W:X:Y::Z/128
2620:W:X:Y::Z/128 via fe80::225:90ff:fe6b:f52c on vlan7 [ospf_v6 09:12] * I (150/15) [W.X.Y.Z]

Thursday, June 13, 2013

Configuring NFSv4 on FreeBSD

After much monkeying around, I finally have NFSv4 running on FreeBSD. Since there seems to be a lack of documentation specifically for NFSv4, here is my take on it. The nfsv4 man page does tell you what you need to know in order to get a basic client and server running, but that was clear to me only after I had it working. In my setup, my server is running FreeBSD 9.1, and my client runs 9.0.

Server Configuration

  1. In /etc/rc.conf, add the following lines

    nfs_server_enable="YES"
    nfsv4_server_enable="YES"
    nfsuserd_enable="YES"

  2. Start the NFS daemons by running

    /etc/rc.d/nfsd start

  3. Open /etc/exports in your preferred editor. Add a "V4:" line to the file, specifying the root of your NFSv4 tree. There are two choices at this point. Option 1 is to place your NFSv4 root at a location other than the root of the server filesystem. With this option, you can create an arbitrary NFS tree for clients to attach to, independent of how data is actually situated on the filesystem(s). Nullfs mounts may be used to include outside directories in the NFS root. Option 2 is to make the NFSv4 root the actual root of the system. This preserves the behavior of old NFS implementations. Regardless of where you put your V4 root, you must also add export lines, in the same style as NFSv3. The exports man page can be helpful here, and discusses the security implications of using NFSv4.

    # Option 1
    V4: /nfsv4 -network=10.0.0.0 -mask=255.255.255.192
    /nfsv4/ports -maproot=root: -network=10.0.0.0 -mask=255.255.255.192
    
    # Option 2
    V4: / -network=10.0.0.0 -mask=255.255.255.192
    /usr/pxe/ports -maproot=root: -network=10.0.0.0 -mask=255.255.255.192

  4. Reload the exports file by signalling mountd

    killall -HUP mountd
    
    

Client Configuration

Clients should now be able to mount the exported filesystem using the following commands, corresponding to the NFSv4 root options specified above. Notice that with option 1, the remote path omits the /nfsv4 prefix of the server.

# Option 1
mount -t nfs -o nfsv4 server:/ports /mnt

# Option 2
mount -t nfs -o nfsv4 server:/usr/pxe/ports /mnt

Errors

If you get the following error when trying to mount from the client, don't be fooled:

mount_nfs: /mnt, : No such file or directory

This may indicate that you have misspelled the remote path in your mount command. It may also indicate that you have an error in your exports file, or that your exports file is not configured the way you think it is. Go back and read step 3 of the Server Configuration.

Wednesday, October 31, 2012

FreeBSD 9.0 newnfs slow throughput

We recently updated our production systems to FreeBSD 9.0. Almost immediately afterwards, we began getting reports of crummy performance on NFS mounts. After following a lot of false leads, I finally traced the issue to the new NFS code in FreeBSD, appropriately named 'newnfs'. I remounted NFS using the 'oldnfs' filesystem type, and performance immediately went back to its previous levels; a 55% increase by my most conservative testing, with users reporting orders of magnitude improvement for some jobs.

Using old NFS is a band-aid, not a solution. The FreeBSD NFS wizards want to prune the old code for FreeBSD 10. I set to work, trying to determine the reason for the poor throughput on newnfs. I discovered that newnfs uses a default rsize and wsize value of 65536 (64k). The old NFS implementation uses a value of 32768. Remounting newnfs with the options rsize=32768,wsize=32768 results in performance very similar to that of oldnfs.
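For reference, the remount amounts to passing both options to mount. The helper below is a hypothetical sketch that only prints the command, so it can be reviewed before being run as root; the server path in the usage note is a placeholder.

```shell
#!/bin/sh
# nfs_mount_cmd REMOTE MOUNTPOINT [SIZE]: print a mount command that pins
# rsize and wsize to SIZE bytes (default 32768, the oldnfs value).
nfs_mount_cmd() {
    size=${3:-32768}
    echo "mount -t nfs -o rsize=$size,wsize=$size $1 $2"
}
```

For example, `nfs_mount_cmd server:/export /mnt` prints `mount -t nfs -o rsize=32768,wsize=32768 server:/export /mnt`.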

Tuesday, September 4, 2012

Migrating to NFSv4 on FreeBSD

I was recently tasked with configuring an existing FreeBSD NFSv3 server to allow NFSv4 mounts. We don't care about v4 security at this point, just that the v4 protocol is being used. The FreeBSD handbook does not yet have any content discussing the configuration of a v4 server, and I was not able to find any good resources on Google. The nfsv4 man pages were informative, but not helpful. Here is the nutshell version of what I finally figured out.

On the NFS server, the following changes need to be made. These notes assume that you already have a working NFSv3 server.

# /etc/rc.conf changes

nfsv4_server_enable="YES"
nfsuserd_enable="YES"

# /etc/exports changes
# The 'V4' line defines the root of your NFSv4 tree. As I
# understand it, the paths you want exported must also be listed
# in the exports file, the same way they are for v3.
V4: /
/home -alldirs -maproot=root: -network=10.0.0.0/24

To mount the filesystem on the client using NFSv4, you must add the nfsv4 option to the mount command.

root@client:~-> mount -t nfs -o nfsv4 server:/home /mnt
root@client:~-> mount
server:/home on /mnt (nfs, nfsv4acls)

Tuesday, June 12, 2012

BackupPC gotchas for PCBSD.

I am getting BackupPC configured to back up a new PCBSD desktop. Every time I do this, I have to relearn what tricks are required, so here is my troubleshooting checklist. Probably applicable to most Linux and Unix-like Operating Systems as well. If you've tried getting this set up before, the useless BackupPC error "Unable to read 4 bytes" has probably become the bane of your existence.

  1. Does the client sshd configuration allow root logins? Check /etc/ssh/sshd_config, specifically the PermitRootLogin parameter. Unless you absolutely need password authentication for root, use the following configuration: PermitRootLogin without-password
  2. Make sure that the client has the public ssh key for the BackupPC user on the server included in /root/.ssh/authorized_keys. If you haven't generated an ssh key for BackupPC, you should do that.
  3. The permissions for /root/.ssh/authorized_keys on the client should be 640. /root/.ssh should be 750. Ownership should be root:wheel.
  4. The .ssh/known_hosts file for the server BackupPC user should contain the fingerprint for the client. If you have connected to the client from the server before as another user (root), the fingerprint may already be in that user's known_hosts file. You can copy it into the BackupPC user's file. If you have not already connected to this host, you can verify the client configuration by connecting with the BackupPC ssh keys like so: ssh -i /path/to/bpc/.ssh/id_rsa client.example.org. The hostname you use (and known_hosts contains) MUST contain the exact hostname BackupPC is configured to connect to. If you connect to 'client' (leveraging your domain search list), but BackupPC is configured to connect to 'client.example.org', the ssh connection will fail.
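The permission check from item 3 is easy to script. This is a rough sketch, with hypothetical helper names `has_mode` and `check_ssh_perms`; it relies only on find's exact `-perm` match and does not look at ownership.

```shell
#!/bin/sh
# has_mode PATH MODE: succeed if PATH exists and its permission bits are
# exactly the octal MODE.
has_mode() {
    [ -n "$(find "$1" -prune -perm "$2" -print 2>/dev/null)" ]
}

# check_ssh_perms DIR: report whether DIR (for example /root/.ssh) and its
# authorized_keys file carry the modes from the checklist above.
check_ssh_perms() {
    status=0
    has_mode "$1" 750 || { echo "$1 should be mode 750"; status=1; }
    has_mode "$1/authorized_keys" 640 || { echo "$1/authorized_keys should be mode 640"; status=1; }
    return $status
}
```

On the client, `check_ssh_perms /root/.ssh` prints nothing and returns success when both modes match, and otherwise names the path that needs fixing.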

Pwning the Spotify client

I've been trying to bend the Spotify client to my will for months. I love the service, but the mandatory P2P network traffic generated by the client is so abusive that I can't do my job and listen to music at the same time. You hear that Spotify, your draconian client prevents me from doing my job. I've tried all sorts of tricks to try and block the network traffic, but it's slippery to try and block without impacting legitimate traffic.

I finally decided to try poking at the storage space available to the client, in hopes that I could cut it off at the knees there. I've learned some interesting things (thanks to this blog post for a point in the _right_ direction).

  • The Spotify client for OSX puts the settings file at ~/Library/Application\ Support/Spotify/settings
  • The cache_location parameter controls where the client tries to put downloaded data. I don't know if parameter position is important in the Spotify configuration file, but the client puts this parameter (for me) between the listen_port and cache_size params.
  • Spotify does not appear to respect the cache_size parameter when it is running, at least not in the short term. I tried setting a cache_size of 1MB, but it appears that the client continuously caches music you listen to. The cache storage directory is reduced to the configured size on client start, apparently.
  • Because of this, the client cannot be contained by changing the location of the cache to a tiny filesystem. I tried using a 20MB Mac disk image as storage; Spotify happily filled the entire image, then stopped playing, complaining about a full drive.
  • When I reduced the cache_size to 1MB and deleted the existing cache, starting the Spotify client produced a message that offline playback is disabled. It remains to be seen if this also means P2P is disabled. Time will tell.
I really wish that Spotify would allow users to control (or disable) the P2P function of the client. I am not opposed to giving a little bit of bandwidth to P2P, but I need to be able to do my job as well. I'm not optimistic though. Although their product is excellent, Spotify seems to be unresponsive to customer requests, and their customer support seems to be dismal, at least in the United States. If anyone at Spotify cares to prove any of this wrong, I would be happy to update/revise this post.

Tuesday, April 10, 2012

/sbin/shutdown permission denied

I have been working on setting up a VMware-based demo environment to be used for a presentation I will be giving in about a month. As a part of the demonstration, I need a lot of nearly identical virtual machines, so PXE-booting them seems like a great idea.

I set up my test NFS server and built a PXE image, following the lead of the FreeBSD handbook, and a PXE setup that we use at $work. After getting my virtual PXE host to boot, I quickly discovered that I was unable to login as root, despite having set a password. After a lot of screwing around, including compiling 3 different FreeBSD source trees, I finally tried booting with the memory filesystems, as described in section 32.8.2 of the handbook. This allowed me to log in, but was still not ideal, since changing anything in /etc required rebuilding the archive. I was also unable to shut down the PXE VM, because calling shutdown returned "permission denied," as root.

I turned to smarter people on IRC, who suggested that this behavior is usually caused by FreeBSD jails, a nosuid-mounted filesystem, or incorrect permissions on /sbin/shutdown. The second possibility got me thinking, which led me to the answer. My NFS export on the server was missing the '-maproot=root:' directive. Adding this directive got everything working as expected.

Tuesday, January 24, 2012

Stupid Cacti tricks

I spent some time this weekend consolidating services I have running at home down to one box. I migrated my Cacti installation to its new home (from one FreeBSD jail to another), and moved the mysql server to its own jail. I quickly realized that there was a problem. My graphs weren't working anymore. After some checking, I determined that updates to the rrd files were failing because their timestamps were in the past by six hours, which is precisely the offset of my time zone (CST). I checked all of the time zone configuration on the Cacti jail, and everything was set correctly. I went to bed, frustrated, but figuring that the data would catch up by morning.

Morning came, and my graphs were indeed logging data again, but offset by six hours. The graph window showed the correct time range, but the end of the data line was six hours behind local time. After too much screwing around and reading through the Cacti php, I finally discovered that Cacti sends its poller results to mysql, before retrieving them to put in the rrd files (and purging the records from mysql). The times I was getting back from the new mysql server were being sent as local time, not UTC.

Long story longer, I had neglected to set the time zone in the jail containing mysql.
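
To see why a time zone mismatch between two hosts shows up as exactly six hours, here is a small Python sketch. It is an illustration only, not Cacti's code: the timestamp string is hypothetical, and CST is modeled as a fixed -6 hour offset. The same zone-less wall-clock string is interpreted once as CST and once as UTC.

```python
from datetime import datetime, timedelta, timezone

CST = timezone(timedelta(hours=-6), "CST")  # assumption: fixed offset, no DST
FMT = "%Y-%m-%d %H:%M:%S"

def to_epoch(wall_clock, tz):
    """Interpret a zone-less wall-clock string as being in tz."""
    parsed = datetime.strptime(wall_clock, FMT).replace(tzinfo=tz)
    return int(parsed.timestamp())

stamp = "2012-01-22 09:00:00"  # hypothetical poller timestamp
skew = to_epoch(stamp, CST) - to_epoch(stamp, timezone.utc)
print(skew)  # 21600 seconds, i.e. six hours
```

Any time a timestamp crosses between hosts as a bare string, both ends have to agree on the zone, or the data lands this far off.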

Friday, April 22, 2011

FreeBSD CARP+BRIDGE+VLAN=BAD

Bridge, good. VLANs, great. CARP, awesome. BRIDGE+VLAN+CARP, pwned. We decided to purchase a pair of Atom-based systems for use as the office firewalls. The only thing we weren't completely pleased with was the single Ethernet port. Given that we have been using vlans elsewhere on the network, we didn't expect that there would be any problems using vlans for all interfaces. It turned out to be a big box of pain. In addition to routing problems, we also appear to have aggravated a bug that hangs the system.

VLAN+CARP is a fairly common configuration on firewalls. Our office and DC LANs use the same subnet and are bridged over an OpenVPN tunnel. Trying to incorporate VLAN+CARP into a bridge seems to cause problems. This diagram illustrates our logical network setup.



After a lot of trial and error, a number of conclusions were drawn.

  • Routing over the bridge doesn't work the same when using vlans. The VPN server pushes a route to our production network when clients connect. When the office firewall was using a physical Ethernet interface for the LAN, this route would refer to the LAN interface as the outgoing interface for this connection. This seems counter-intuitive, but it worked just fine. When the Ethernet LAN interface was replaced with a vlan, the tap (VPN) interface was referenced by the route to production. This seemed to be more logical, except that the Production network became unreachable.
  • After some troubleshooting, I figured out that access to the production network could be fixed by adding a static route to the DC firewall (next-hop to production network) pointing out the tap interface. This seemed to allow traffic to flow smoothly to production.
  • Adding CARP into the above configuration caused the firewall to hang randomly. There seemed to be no indication of a crash, no excessive resource use or network traffic.
  • Routing traffic between tagged vlans and the underlying physical interface may be problematic. This was an earlier configuration I tried, and it seemed to have issues. However, at the time I had not identified CARP as the source of the system hangs, so this may be a non-issue.

The routing issue was reported in a PR that can be found here. The routing tables mentioned above can be found here.

Thursday, December 2, 2010

Deciphering Dell IPMI SNMP Traps

Dell servers can be configured to send traps when a system event occurs. The following sections discuss how to decipher the SNMP traps. A PowerEdge R300 was used in the examples, and the event discussed is a Chassis Intrusion Alarm.

Useful links

PET Specification
Dell PET Events (MIB)

SNMP Trap OIDs

SNMP Traps from the BMC arrive with the following base OID
.1.3.6.1.4.1.3183.1.1

.1.3.6.1.4.1.3183.1.1.0.x defines the Event type, per the Dell MIB above.
.1.3.6.1.4.1.3183.1.1.1 defines the PET spec information analyzed below.

Based on the Event type OID, you can determine much of what you need to know to generate a Nagios alert. In our case,
.1.3.6.1.4.1.3183.1.1.0.356096 indicates an Intrusion event.
.1.3.6.1.4.1.3183.1.1.0.356224 indicates an Intrusion event has been cleared.
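
The two trap numbers differ by exactly 128, which fits the layout of the PET "specific trap" value as I read the spec. Treat that layout as an assumption, since the post only cites the Dell MIB: sensor type in bits 23:16, event type in bits 15:8, and in the low byte a deassertion flag (bit 7) plus the event offset (bits 3:0). A minimal Python sketch under that assumption, with decode_specific_trap being a hypothetical helper name:

```python
def decode_specific_trap(value):
    """Split a PET specific-trap number into its assumed bit fields."""
    low_byte = value & 0xFF
    return {
        "sensor_type": (value >> 16) & 0xFF,
        "event_type": (value >> 8) & 0xFF,
        "event_offset": low_byte & 0x0F,      # assumed: bits 3:0
        "deassertion": bool(low_byte & 0x80),  # assumed: bit 7
    }

for trap in (356096, 356224):
    print(trap, hex(trap), decode_specific_trap(trap))
```

Under this reading both traps carry sensor type 0x05 and event type 0x6F, and only the deassertion flag differs, which lines up with "Intrusion" versus "Intrusion cleared."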

PET Analysis

Offset   | Field                                    | Value               | Bytes
1:16     | GUID (t3)                                |                     | 44 45 4C 4C 4B 00 10 4A 80 4B C3 C0 4F 4D 4C 31
17:18    | Seq#                                     | 0001                | 00 01
19:22    | Timestamp (seconds from 0:00 1/1/98)     | 407532757           | 18 4A 74 D5
23:24    | UTC offset, minutes (0xFFFF unspecified) | unspecified         | FF FF
25       | Trap Source Type                         | IPMI                | 20
26       | Event Source Type                        | IPMI                | 20
27       | Event Severity                           | Critical            | 10
28       | Sensor Device                            | 32                  | 20
29       | Sensor Number                            | 115                 | 73
30       | Entity                                   | 24 (System Chassis) | 18
31       | Entity Instance (0x0 unspecified)        | unspecified         | 00
32:39    | Event Data                               |                     | 80 01 FF 00 00 00 00 00
40       | Language Code                            | 25                  | 19
41:44    | Manufacturer ID                          | Dell                | 00 00 02 A2
45:46    | System ID                                | 256?                | 01 00
47:(110) | OEM Custom Fields                        |                     | 6C 69 6F 6E 2D 34 2D 69 70 6D 69 C1

Example PET fields

Pipes denote field bounds

                                               |      |           |     |  |  |  |  |  |  |  |      |                 |  |           |     |
44 45 4C 4C 4B 00 10 4A 80 4B C3 C0 4F 4D 4C 31 00 01 18 4A 74 D5 FF FF 20 20 10 20 73 18 00 80 01 FF 00 00 00 00 00 19 00 00 02 A2 01 00 6C 69 6F 6E 2D 34 2D 69 70 6D 69 C1
44 45 4C 4C 4B 00 10 4A 80 4B C3 C0 4F 4D 4C 31 00 05 18 4A 74 EE FF FF 20 20 04 20 73 18 00 80 01 FF 00 00 00 00 00 19 00 00 02 A2 01 00 6C 69 6F 6E 2D 34 2D 69 70 6D 69 C1
44 45 4C 4C 4B 00 10 4A 80 4B C3 C0 4F 4D 4C 31 00 09 18 4A 8E CA FF FF 20 20 10 20 73 18 00 80 01 FF 00 00 00 00 00 19 00 00 02 A2 01 00 6C 69 6F 6E 2D 34 2D 69 70 6D 69 C1
44 45 4C 4C 4B 00 10 4A 80 4B C3 C0 4F 4D 4C 31 00 0D 18 4A 8E F2 FF FF 20 20 04 20 73 18 00 80 01 FF 00 00 00 00 00 19 00 00 02 A2 01 00 6C 69 6F 6E 2D 34 2D 69 70 6D 69 C1
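
The field layout above can be unpacked mechanically. Here is a minimal Python sketch that splits the first example trap into named fields, assuming big-endian integers (which matches the Seq#, Timestamp and Manufacturer ID values worked out in this post). The field names and the parse_pet helper are my own labels, not anything from the PET spec or a library.

```python
import struct

# First example trap from the post.
PET_HEX = (
    "44 45 4C 4C 4B 00 10 4A 80 4B C3 C0 4F 4D 4C 31 00 01 18 4A 74 D5 "
    "FF FF 20 20 10 20 73 18 00 80 01 FF 00 00 00 00 00 19 00 00 02 A2 "
    "01 00 6C 69 6F 6E 2D 34 2D 69 70 6D 69 C1"
)

FIELDS = (
    "guid", "sequence", "timestamp", "utc_offset", "trap_source",
    "event_source", "severity", "sensor_device", "sensor_number",
    "entity", "entity_instance", "event_data", "language",
    "manufacturer_id", "system_id",
)
# 16-byte GUID, 2-byte seq, 4-byte timestamp, 2-byte UTC offset,
# seven 1-byte fields, 8-byte event data, 1-byte language,
# 4-byte manufacturer ID, 2-byte system ID: 46 bytes in total.
HEADER = struct.Struct(">16sHIH7B8sBIH")

def parse_pet(hex_string):
    """Split a PET varbind (hex string) into a dict of fields."""
    data = bytes.fromhex(hex_string)
    fields = dict(zip(FIELDS, HEADER.unpack_from(data)))
    fields["oem"] = data[HEADER.size:]  # whatever follows the fixed part
    return fields

pet = parse_pet(PET_HEX)
print(pet["sequence"], pet["timestamp"], hex(pet["sensor_number"]))
print(pet["manufacturer_id"], pet["system_id"], pet["oem"])
```

The OEM bytes in this example happen to be printable ASCII followed by a single 0xC1 byte; what Dell intends by that is not something this sketch tries to interpret.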


Decoding the PET fields

GUID (16-bytes)

Dell doesn't appear to follow the specification for this field. The first 4 characters are DELL, followed by a string that incorporates part of the Service Tag (CKJKML1). More clarity here could be useful.
44 45 4C 4C 4B 00 10 4A 80 4B C3 C0 4F 4D 4C 31
D  E  L  L  K        J     K        O  M  L  1

Sequence number (2-bytes)

An increasing counter, but it doesn't appear to increment by 1. In the example traps above it goes 0001, 0005, 0009, 000D, a step of 4 each time.

Timestamp (4-bytes)

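An odd metric: the number of seconds elapsed since 0:00 1/1/1998 (883612800 as a Unix timestamp). Here is some Perl that will make a regular timestamp, printed in the local time zone (CST in my case).
my $time = localtime(883612800 + 407532757);  # PET epoch + trap timestamp
print "$time\n";
Tue Nov 30 13:32:37 2010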
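
The same conversion in Python, starting from the raw timestamp bytes and printing UTC so the result does not depend on the machine's time zone. This is a sketch that assumes the 1998 base is midnight UTC, as the 883612800 constant above implies; pet_time is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the PET base is 1998-01-01 00:00 UTC (Unix time 883612800).
PET_EPOCH = datetime(1998, 1, 1, tzinfo=timezone.utc)

def pet_time(raw_hex):
    """Convert the 4 timestamp bytes (big-endian hex) to a UTC datetime."""
    seconds = int.from_bytes(bytes.fromhex(raw_hex), "big")
    return PET_EPOCH + timedelta(seconds=seconds)

print(int(PET_EPOCH.timestamp()))           # 883612800
print(pet_time("18 4A 74 D5").isoformat())  # 2010-11-30T19:32:37+00:00
```

19:32:37 UTC is 13:32:37 CST, which agrees with the Perl output.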

Trap Source Type (1-byte)

Defined in Table 3 (p.9) of the PET spec.

Event Source Type (1-byte)

Defined in Table 3 (p.9) of the PET spec.

Event Severity (1-byte)

Defined in Table 3 of the PET spec. 0x10 == Critical, 0x4 == Normal.

Sensor Device (1-byte)

Device ID. The value 0x20 in the example matches the BMC's address (20h) as listed by ipmitool:

root-> ipmitool sdr list mcloc
BMC | Dynamic MC @ 20h | ok
DRAC 5 | Dynamic MC @ 26h | ok

Sensor Number (1-byte)

The actual sensor ID as known by the BMC. In the above example, 0x73 is the Intrusion sensor, which the ipmitool output below reports with Sensor Type Physical Security (0x5 in PET spec table 5, p.13, which defines Sensor Types). Note that 0x73 is a sensor number, not a sensor type, so the OEM RESERVED sensor type range (0xC0-0xFF) does not apply to it.

root-> ipmitool -v sensor
Sensor ID : Temp (0x1)
Entity ID : 3.1
Sensor Type (Analog) : Temperature
Sensor Reading : Unable to read sensor: Device Not Present

Event Status : Event Messages Disabled
Assertion Events :
Event Enable : Event Messages Disabled
Assertions Enabled :

Sensor ID : Planar Temp (0x7)
Entity ID : 7.1
Sensor Type (Analog) : Temperature
Sensor Reading : 21 (+/- 1) degrees C
Status : ok
Lower Non-Recoverable : na
Lower Critical : 3.000
Lower Non-Critical : 8.000
Upper Non-Critical : 53.000
Upper Critical : 58.000
Upper Non-Recoverable : na
Assertion Events :
Assertions Enabled : lnc- lcr- unc+ ucr+
Deassertions Enabled : lnc- lcr- unc+ ucr+

Sensor ID : Ambient Temp (0x8)
Entity ID : 7.1
Sensor Type (Analog) : Temperature
Sensor Reading : 16 (+/- 1) degrees C
Status : ok
Lower Non-Recoverable : na
Lower Critical : 3.000
Lower Non-Critical : 8.000
Upper Non-Critical : 42.000
Upper Critical : 47.000
Upper Non-Recoverable : na
Assertion Events :
Assertions Enabled : lnc- lcr- unc+ ucr+
Deassertions Enabled : lnc- lcr- unc+ ucr+

Sensor ID : CMOS Battery (0x10)
Entity ID : 7.1
Sensor Type (Discrete): Battery

Sensor ID : VCORE (0x12)
Entity ID : 3.1
Sensor Type (Discrete): Voltage
States Asserted : Digital State
[State Deasserted]

Sensor ID : CPU VTT (0x16)
Entity ID : 7.1
Sensor Type (Discrete): Voltage
States Asserted : Digital State
[State Deasserted]

Sensor ID : 1.5V PG (0x17)
Entity ID : 7.1
Sensor Type (Discrete): Voltage
States Asserted : Digital State
[State Deasserted]

Sensor ID : 1.8V PG (0x18)
Entity ID : 7.1
Sensor Type (Discrete): Voltage
States Asserted : Digital State
[State Deasserted]

Sensor ID : 1.5V Riser PG (0x19)
Entity ID : 16.1
Sensor Type (Discrete): Voltage
States Asserted : Digital State
[State Deasserted]

Sensor ID : FAN MOD 1A RPM (0x30)
Entity ID : 7.1
Sensor Type (Analog) : Fan
Sensor Reading : 6675 (+/- 75) RPM
Status : ok
Lower Non-Recoverable : na
Lower Critical : 3525.000
Lower Non-Critical : na
Upper Non-Critical : na
Upper Critical : na
Upper Non-Recoverable : na
Assertion Events :
Assertions Enabled : lcr-
Deassertions Enabled : lcr-

Sensor ID : FAN MOD 1B RPM (0x31)
Entity ID : 7.1
Sensor Type (Analog) : Fan
Sensor Reading : 6375 (+/- 75) RPM
Status : ok
Lower Non-Recoverable : na
Lower Critical : 2325.000
Lower Non-Critical : na
Upper Non-Critical : na
Upper Critical : na
Upper Non-Recoverable : na
Assertion Events :
Assertions Enabled : lcr-
Deassertions Enabled : lcr-

Sensor ID : FAN MOD 2A RPM (0x32)
Entity ID : 7.1
Sensor Type (Analog) : Fan
Sensor Reading : 6900 (+/- 75) RPM
Status : ok
Lower Non-Recoverable : na
Lower Critical : 3525.000
Lower Non-Critical : na
Upper Non-Critical : na
Upper Critical : na
Upper Non-Recoverable : na
Assertion Events :
Assertions Enabled : lcr-
Deassertions Enabled : lcr-

Sensor ID : FAN MOD 2B RPM (0x33)
Entity ID : 7.1
Sensor Type (Analog) : Fan
Sensor Reading : 6300 (+/- 75) RPM
Status : ok
Lower Non-Recoverable : na
Lower Critical : 2325.000
Lower Non-Critical : na
Upper Non-Critical : na
Upper Critical : na
Upper Non-Recoverable : na
Assertion Events :
Assertions Enabled : lcr-
Deassertions Enabled : lcr-

Sensor ID : FAN MOD 3A RPM (0x34)
Entity ID : 7.1
Sensor Type (Analog) : Fan
Sensor Reading : 6900 (+/- 75) RPM
Status : ok
Lower Non-Recoverable : na
Lower Critical : 3525.000
Lower Non-Critical : na
Upper Non-Critical : na
Upper Critical : na
Upper Non-Recoverable : na
Assertion Events :
Assertions Enabled : lcr-
Deassertions Enabled : lcr-

Sensor ID : FAN MOD 3B RPM (0x35)
Entity ID : 7.1
Sensor Type (Analog) : Fan
Sensor Reading : 6225 (+/- 75) RPM
Status : ok
Lower Non-Recoverable : na
Lower Critical : 2325.000
Lower Non-Critical : na
Upper Non-Critical : na
Upper Critical : na
Upper Non-Recoverable : na
Assertion Events :
Assertions Enabled : lcr-
Deassertions Enabled : lcr-

Sensor ID : FAN MOD 4A RPM (0x36)
Entity ID : 7.1
Sensor Type (Analog) : Fan
Sensor Reading : 6825 (+/- 75) RPM
Status : ok
Lower Non-Recoverable : na
Lower Critical : 3525.000
Lower Non-Critical : na
Upper Non-Critical : na
Upper Critical : na
Upper Non-Recoverable : na
Assertion Events :
Assertions Enabled : lcr-
Deassertions Enabled : lcr-

Sensor ID : FAN MOD 4B RPM (0x37)
Entity ID : 7.1
Sensor Type (Analog) : Fan
Sensor Reading : 6150 (+/- 75) RPM
Status : ok
Lower Non-Recoverable : na
Lower Critical : 2325.000
Lower Non-Critical : na
Upper Non-Critical : na
Upper Critical : na
Upper Non-Recoverable : na
Assertion Events :
Assertions Enabled : lcr-
Deassertions Enabled : lcr-

Sensor ID : FAN MOD 5A RPM (0x38)
Entity ID : 7.1
Sensor Type (Analog) : Fan
Sensor Reading : Unable to read sensor: Device Not Present

Assertion Events :
Assertions Enabled : lcr-
Deassertions Enabled : lcr-

Sensor ID : FAN MOD 5B RPM (0x39)
Entity ID : 7.1
Sensor Type (Analog) : Fan
Sensor Reading : Unable to read sensor: Device Not Present

Assertion Events :
Assertions Enabled : lcr-
Deassertions Enabled : lcr-

Sensor ID : FAN MOD 6A RPM (0x3a)
Entity ID : 7.1
Sensor Type (Analog) : Fan
Sensor Reading : Unable to read sensor: Device Not Present

Assertion Events :
Assertions Enabled : lcr-
Deassertions Enabled : lcr-

Sensor ID : FAN MOD 6B RPM (0x3b)
Entity ID : 7.1
Sensor Type (Analog) : Fan
Sensor Reading : Unable to read sensor: Device Not Present

Assertion Events :
Assertions Enabled : lcr-
Deassertions Enabled : lcr-

Sensor ID : Presence (0x50)
Entity ID : 3.1
Sensor Type (Discrete): Entity Presence
States Asserted : Entity Presence
[Present]

Sensor ID : Presence (0x54)
Entity ID : 10.1
Sensor Type (Discrete): Entity Presence
Unable to read sensor: Device Not Present

Sensor ID : Presence (0x55)
Entity ID : 10.2
Sensor Type (Discrete): Entity Presence
Unable to read sensor: Device Not Present

Sensor ID : Presence (0x56)
Entity ID : 26.1
Sensor Type (Discrete): Entity Presence
States Asserted : Entity Presence
[Absent]

Sensor ID : PFault Fail Safe (0x5f)
Entity ID : 7.1
Sensor Type (Discrete): Voltage
Unable to read sensor: Device Not Present

Sensor ID : Status (0x60)
Entity ID : 3.1
Sensor Type (Discrete): Processor
States Asserted : Processor
[Presence detected]

Sensor ID : Status (0x64)
Entity ID : 10.1
Sensor Type (Discrete): Power Supply
Unable to read sensor: Device Not Present

Sensor ID : Status (0x65)
Entity ID : 10.2
Sensor Type (Discrete): Power Supply
Unable to read sensor: Device Not Present

Sensor ID : Status (0x66)
Entity ID : 16.1
Sensor Type (Discrete): Cable / Interconnect
States Asserted : Cable/Interconnect
[Connected]

Sensor ID : RAC Status (0x70)
Entity ID : 7.1
Sensor Type (Discrete): Module / Board

Sensor ID : OS Watchdog (0x71)
Entity ID : 7.1
Sensor Type (Discrete): Watchdog

Sensor ID : SEL (0x72)
Entity ID : 7.1
Sensor Type (Discrete): Event Logging Disabled
Unable to read sensor: Device Not Present

Sensor ID : Intrusion (0x73)
Entity ID : 7.1
Sensor Type (Discrete): Physical Security

Sensor ID : PS Redundancy (0x74)
Entity ID : 7.1
Sensor Type (Discrete): Power Supply
Unable to read sensor: Device Not Present

Sensor ID : Fan Redundancy (0x75)
Entity ID : 7.1
Sensor Type (Discrete): Fan
States Asserted : Redundancy State
[Fully Redundant]

Sensor ID : CPU Temp Interf (0x76)
Entity ID : 7.1
Sensor Type (Discrete): Temperature
Unable to read sensor: Device Not Present

Sensor ID : Drive (0x80)
Entity ID : 26.1
Sensor Type (Discrete): Drive Slot / Bay
Unable to read sensor: Device Not Present

Sensor ID : Cable SAS (0x90)
Entity ID : 26.1
Sensor Type (Discrete): Cable / Interconnect
Unable to read sensor: Device Not Present

Sensor ID : Cable PDB Ctrl (0x9b)
Entity ID : 7.1
Sensor Type (Discrete): Cable / Interconnect
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
e8 84 ff ff 02 07 d9 9d ea c3 e4 05 21 a5 5d d5
e8 bb f7 a3 c0 82 d0 e8 84 ff ff 02 07 d9 9d ea
c3 e4 05 21 a5 5d d5 e8 bb
Sensor ID : ECC Corr Err (0x1)
Entity ID : 34.1
Sensor Type (Discrete): Memory
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
b4 e5 ff ff 02 07 e1 05 e7 29 ae 0a 84 fb aa eb
3b b9 03 d1 bc 49 df b4 e5 ff ff 02 07 e1 05 e7
29 ae 0a 84 fb aa eb 3b b9
Sensor ID : ECC Uncorr Err (0x2)
Entity ID : 34.1
Sensor Type (Discrete): Memory
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
ed db ff ff 02 07 a4 9f 92 d0 b0 e2 f7 e1 7d 32
6a d1 4b 5d 2e b5 13 ed db ff ff 02 07 a4 9f 92
d0 b0 e2 f7 e1 7d 32 6a d1
Sensor ID : I/O Channel Chk (0x3)
Entity ID : 34.1
Sensor Type (Discrete): Critical Interrupt
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
4a bb ff ff 02 07 08 02 91 61 be 82 9f 5a c2 fe
06 c7 dd 43 e1 e8 03 4a bb ff ff 02 07 08 02 91
61 be 82 9f 5a c2 fe 06 c7
Sensor ID : PCI Parity Err (0x4)
Entity ID : 34.1
Sensor Type (Discrete): Critical Interrupt
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
2d 73 ff ff 02 07 ef 35 08 e3 a8 d0 20 24 06 f9
c7 8e d2 6b 6e bc de 2d 73 ff ff 02 07 ef 35 08
e3 a8 d0 20 24 06 f9 c7 8e
Sensor ID : PCI System Err (0x5)
Entity ID : 34.1
Sensor Type (Discrete): Critical Interrupt
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
a3 0d ff ff 02 07 4f f0 04 c0 1a 99 af 12 46 1d
74 e9 bf 16 12 0c 13 a3 0d ff ff 02 07 4f f0 04
c0 1a 99 af 12 46 1d 74 e9
Sensor ID : SBE Log Disabled (0x6)
Entity ID : 34.1
Sensor Type (Discrete): Event Logging Disabled
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
cc b9 ff ff 02 07 87 67 27 84 5a b6 f3 f1 82 4a
8b 89 74 67 69 be 11 cc b9 ff ff 02 07 87 67 27
84 5a b6 f3 f1 82 4a 8b 89
Sensor ID : Logging Disabled (0x7)
Entity ID : 34.1
Sensor Type (Discrete): Event Logging Disabled
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
3b c3 ff ff 02 07 3e 85 9a 4c 8f 63 b7 53 73 e4
02 5a 3b 5d 4e 47 73 3b c3 ff ff 02 07 3e 85 9a
4c 8f 63 b7 53 73 e4 02 5a
Sensor ID : Unknown (0x8)
Entity ID : 34.1
Sensor Type (Discrete): System Event
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
f7 35 ff ff 02 07 f6 37 7a ef e8 74 61 e9 71 f9
fc b0 e1 89 d3 f5 a9 f7 35 ff ff 02 07 f6 37 7a
ef e8 74 61 e9 71 f9 fc b0
Sensor ID : CPU Protocol Err (0xa)
Entity ID : 34.1
Sensor Type (Discrete): Processor
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
23 f7 ff ff 02 07 3c 95 d4 23 3a a8 33 05 91 7e
ce 24 73 7c 99 10 8c 23 f7 ff ff 02 07 3c 95 d4
23 3a a8 33 05 91 7e ce 24
Sensor ID : CPU Bus PERR (0xb)
Entity ID : 34.1
Sensor Type (Discrete): Processor
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
09 66 ff ff 02 07 f7 cc ba bf c7 38 50 8b 2f 39
b3 fc 0c 00 72 77 aa 09 66 ff ff 02 07 f7 cc ba
bf c7 38 50 8b 2f 39 b3 fc
Sensor ID : CPU Init Err (0xc)
Entity ID : 34.1
Sensor Type (Discrete): Processor
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
30 89 ff ff 02 07 61 33 4c 17 c3 f1 2d 92 e1 10
57 b6 71 73 93 6a d7 30 89 ff ff 02 07 61 33 4c
17 c3 f1 2d 92 e1 10 57 b6
Sensor ID : CPU Machine Chk (0xd)
Entity ID : 34.1
Sensor Type (Discrete): Processor
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
02 a7 ff ff 02 07 21 d2 89 70 8b 5d 5d 17 8b bb
ae 82 dd 44 ae 4c 51 02 a7 ff ff 02 07 21 d2 89
70 8b 5d 5d 17 8b bb ae 82
Sensor ID : Memory Spared (0x11)
Entity ID : 34.1
Sensor Type (Discrete): Memory
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
32 e8 ff ff 02 07 b1 0d 1f 30 f1 92 fe 56 0d c0
4e 65 ea 72 f3 b1 5c 32 e8 ff ff 02 07 b1 0d 1f
30 f1 92 fe 56 0d c0 4e 65
Sensor ID : Memory Mirrored (0x12)
Entity ID : 34.1
Sensor Type (Discrete): Memory
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
21 6f ff ff 02 07 a8 30 d8 ec 71 02 00 a4 d1 3f
d3 c9 90 7e 8f 06 60 21 6f ff ff 02 07 a8 30 d8
ec 71 02 00 a4 d1 3f d3 c9
Sensor ID : Memory RAID (0x13)
Entity ID : 34.1
Sensor Type (Discrete): Memory
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
c3 d1 ff ff 02 07 e3 44 16 17 2a 90 0e 17 81 bf
e8 08 39 40 ad 72 a0 c3 d1 ff ff 02 07 e3 44 16
17 2a 90 0e 17 81 bf e8 08
Sensor ID : Memory Added (0x14)
Entity ID : 34.1
Sensor Type (Discrete): Memory
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
e8 84 ff ff 02 07 3f f7 41 58 00 29 a3 b9 e6 62
96 15 f7 a3 c0 82 d0 e8 84 ff ff 02 07 3f f7 41
58 00 29 a3 b9 e6 62 96 15
Sensor ID : Memory Removed (0x15)
Entity ID : 34.1
Sensor Type (Discrete): Memory
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
b4 e5 ff ff 02 07 41 0c a2 b5 d5 ac fa 16 a4 72
84 d7 03 d1 bc 49 df b4 e5 ff ff 02 07 41 0c a2
b5 d5 ac fa 16 a4 72 84 d7
Sensor ID : Memory Cfg Err (0x16)
Entity ID : 34.1
Sensor Type (Discrete): Memory
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
ed db ff ff 02 07 07 e8 5a fa 86 ce 6b 76 73 7c
5b aa 4b 5d 2e b5 13 ed db ff ff 02 07 07 e8 5a
fa 86 ce 6b 76 73 7c 5b aa
Sensor ID : Mem Redun Gain (0x17)
Entity ID : 34.1
Sensor Type (Discrete): Memory
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
4a bb ff ff 02 07 48 27 41 3c 46 00 c1 02 1a 34
e8 9c dd 43 e1 e8 03 4a bb ff ff 02 07 48 27 41
3c 46 00 c1 02 1a 34 e8 9c
Sensor ID : PCIE Fatal Err (0x18)
Entity ID : 34.1
Sensor Type (Discrete): Critical Interrupt
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
2d 73 ff ff 02 07 39 32 87 7d 45 a2 db 02 9c c5
37 c9 d2 6b 6e bc de 2d 73 ff ff 02 07 39 32 87
7d 45 a2 db 02 9c c5 37 c9
Sensor ID : Chipset Err (0x19)
Entity ID : 34.1
Sensor Type (Discrete): Critical Interrupt
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
a3 0d ff ff 02 07 d7 c6 5d b1 7f 62 43 47 5a 77
de bc bf 16 12 0c 13 a3 0d ff ff 02 07 d7 c6 5d
b1 7f 62 43 47 5a 77 de bc
Sensor ID : Err Reg Pointer (0x1a)
Entity ID : 34.1
Sensor Type (Discrete): Unknown (0xC1)
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
cc b9 ff ff 02 07 64 6e 71 6c 91 87 23 4a 6b fd
f7 68 74 67 69 be 11 cc b9 ff ff 02 07 64 6e 71
6c 91 87 23 4a 6b fd f7 68
Sensor ID : Mem ECC Warning (0x1b)
Entity ID : 34.1
Sensor Type (Discrete): Memory
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
3b c3 ff ff 02 07 05 79 96 82 45 b8 12 6c c3 5e
cf f2 3b 5d 4e 47 73 3b c3 ff ff 02 07 05 79 96
82 45 b8 12 6c c3 5e cf f2
Sensor ID : Mem CRC Err (0x1c)
Entity ID : 34.1
Sensor Type (Discrete): Memory
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
f7 35 ff ff 02 07 bd 82 5a 26 89 50 fb 7c ab db
db c2 e1 89 d3 f5 a9 f7 35 ff ff 02 07 bd 82 5a
26 89 50 fb 7c ab db db c2
Sensor ID : USB Over-current (0x1d)
Entity ID : 34.1
Sensor Type (Discrete): Memory
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
23 f7 ff ff 02 07 6a e7 7e 30 28 75 30 2c 64 c3
d4 5a 73 7c 99 10 8c 23 f7 ff ff 02 07 6a e7 7e
30 28 75 30 2c 64 c3 d4 5a
Sensor ID : POST Err (0x1e)
Entity ID : 34.1
Sensor Type (Discrete): System Firmwares
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
09 66 ff ff 02 07 07 f7 35 2f af 3c a0 41 6f 32
aa b1 0c 00 72 77 aa 09 66 ff ff 02 07 07 f7 35
2f af 3c a0 41 6f 32 aa b1
Sensor ID : Hdwr version err (0x1f)
Entity ID : 34.1
Sensor Type (Discrete): Version Change
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
30 89 ff ff 02 07 95 50 49 39 93 8e 61 72 fa 30
77 07 71 73 93 6a d7 30 89 ff ff 02 07 95 50 49
39 93 8e 61 72 fa 30 77 07
Sensor ID : Mem Overtemp (0x20)
Entity ID : 34.1
Sensor Type (Discrete): Memory
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
02 a7 ff ff 02 07 31 99 d7 13 76 96 49 e1 f0 58
dc 00 dd 44 ae 4c 51 02 a7 ff ff 02 07 31 99 d7
13 76 96 49 e1 f0 58 dc 00
Sensor ID : Mem Fatal SB CRC (0x21)
Entity ID : 34.1
Sensor Type (Discrete): Memory
Unable to read sensor: Device Not Present

bridge command response (41 bytes)
32 e8 ff ff 02 07 da 7a 46 57 4c f4 82 17 35 7f
63 8f ea 72 f3 b1 5c 32 e8 ff ff 02 07 da 7a 46
57 4c f4 82 17 35 7f 63 8f
Sensor ID : Mem Fatal NB CRC (0x22)
Entity ID : 34.1
Sensor Type (Discrete): Memory
Unable to read sensor: Device Not Present
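
Scrolling through that much output to find one sensor gets old quickly. Here is a minimal Python sketch that indexes text in the `ipmitool -v sensor` format shown above by sensor number. It relies only on the "Sensor ID" and "Sensor Type" lines as they appear in this post; index_sensors and SAMPLE are hypothetical names, and other ipmitool versions may format things differently. Sensor numbers repeat in the listing (Temp and ECC Corr Err are both 0x1), so each number maps to a list.

```python
import re
from collections import defaultdict

SENSOR_ID = re.compile(r"^Sensor ID\s*:\s*(.+?)\s*\((0x[0-9a-fA-F]+)\)$")
SENSOR_TYPE = re.compile(r"^Sensor Type \((?:Analog|Discrete)\)\s*:\s*(.+)$")

def index_sensors(text):
    """Map sensor number -> list of (name, sensor type) pairs."""
    sensors = defaultdict(list)
    name = number = None
    for raw in text.splitlines():
        line = raw.strip()
        match = SENSOR_ID.match(line)
        if match:
            name, number = match.group(1), int(match.group(2), 16)
            continue
        match = SENSOR_TYPE.match(line)
        if match and number is not None:
            sensors[number].append((name, match.group(1).strip()))
            name = number = None
    return dict(sensors)

# Two records copied from the output above.
SAMPLE = """\
Sensor ID : SEL (0x72)
Entity ID : 7.1
Sensor Type (Discrete): Event Logging Disabled
Unable to read sensor: Device Not Present

Sensor ID : Intrusion (0x73)
Entity ID : 7.1
Sensor Type (Discrete): Physical Security
"""

print(index_sensors(SAMPLE)[0x73])  # [('Intrusion', 'Physical Security')]
```

Feed it the Sensor Number byte from the trap (0x73 here) to get the human-readable name.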

Entity (1-byte)

PET spec table 6 (p.17) defines values

Entity Instance (1-byte)

0x0 unspecified

Event Data (8-bytes)

Additional information about the event, as defined in PET spec table 5 (p.13), or in our case, by the OEM.

80 01 FF 00 00 00 00 00

Language Code (1-byte)

Manufacturer ID (4-bytes)

0x2A2 = 674 = Dell
source

System ID (2-bytes)

0x100 = 256 = ???

OEM Custom Fields (<=64-bytes)

Custom fields defined by the OEM.

Tuesday, November 9, 2010

PAM LDAP error: unexpected return value 4?

Are you seeing this error in your logs, along with an inability to log in?

Nov 9 15:05:08 leoger sshd[41524]: in _openpam_check_error_code(): pam_sm_acct_mgmt(): unexpected return value 4
Nov 9 15:05:08 leoger kernel: Nov 9 15:05:08 leoger sshd[41524]: in _openpam_check_error_code(): pam_sm_acct_mgmt(): unexpected return value 4

I did, and I figured out that the problem was caused by having changed my system hostname in config files and DNS, without actually having changed the hostname of the server. Note, the hostname must also be resolvable, either by DNS or /etc/hosts.
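
A quick way to check for this condition is to ask the system what it thinks its hostname is and whether that name resolves. This is a minimal Python sketch using only the standard socket module; hostname_resolves is a hypothetical helper name, and a successful lookup only shows that the name resolves, not that it resolves to the right address.

```python
import socket

def hostname_resolves(name):
    """Return True if the name resolves via DNS or /etc/hosts."""
    try:
        socket.getaddrinfo(name, None)
    except socket.gaierror:
        return False
    return True

# The hostname the running system actually reports, which may differ
# from what the config files and DNS say.
current = socket.gethostname()
state = "resolves" if hostname_resolves(current) else "does NOT resolve"
print(current, state)
```

If the name printed is the old hostname, or it does not resolve, that matches the situation described above.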