Logical Informalism
PresidentBarackObama@pdrap.org
Tuesday, 26 April, 2005. 12:31:11 AM

I repaired the disk error on helium and the self-test completed without error.

Monday, 25 April, 2005. 11:32:44 PM

Had a disk error on my 80G drive in my home system.
Found this at: http://www.gra2.com/article.php/20041015232512624


THIS DOCUMENT SHOWS HOW TO IDENTIFY THE FILE ASSOCIATED WITH AN UNREADABLE DISK SECTOR, AND HOW TO FORCE THAT SECTOR TO REALLOCATE.

Assumptions: Linux OS, ext2 or ext3 file system.

Bruce Allen

This document is version $Id: BadBlockHowTo.txt,v 1.4 2004/03/21 20:38:32 ballen4705 Exp $
It is Copyright Bruce Allen (2004) and distributed under GPL2.


Thanks to Sergey Vlasov, Theodore Ts'o, Michael Bendzick, and others for explaining this to me. I would like to add text showing how to do this for other file systems, in particular ReiserFS, XFS, and JFS: please email me if you can provide this information.

In this example, the disk is failing self-tests at Logical Block Address LBA = 0x016561e9 = 23421417. The LBA counts sectors in units of 512 bytes, and starts at zero.

-----------------------------------------------------------------------------------------------
root]# smartctl -l selftest /dev/hda

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%       217         0x016561e9
-----------------------------------------------------------------------------------------------

Another sign of a bad sector on the disk is a non-zero raw value for the Current Pending Sector count:
-----------------------------------------------------------------------------------------------
root]# smartctl -A /dev/hda
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       1
-----------------------------------------------------------------------------------------------
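
To check these counters from a script rather than by eye, something like the following may help. It is only a sketch: smart_raw is a hypothetical helper name, and it assumes the raw value is the last field on the line, which holds for the count attributes shown here but not necessarily for every attribute.

```shell
# Sketch: print the RAW_VALUE column for one SMART attribute.
# Reads `smartctl -A` style output on stdin; smart_raw is a hypothetical helper name.
smart_raw() {
    awk -v name="$1" '$2 == name { print $NF }'
}

# Example using two lines copied from the output above (no disk access needed):
printf '%s\n' \
    '197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 1' \
    '198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 1' |
    smart_raw Current_Pending_Sector
```

On a live system the same helper could be fed directly: smartctl -A /dev/hda | smart_raw Current_Pending_Sector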

First Step: We need to locate the partition on which this sector of the disk lives:
-----------------------------------------------------------------------------------------------
root]# fdisk -lu /dev/hda

Disk /dev/hda: 123.5 GB, 123522416640 bytes
255 heads, 63 sectors/track, 15017 cylinders, total 241254720 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start        End      Blocks  Id  System
/dev/hda1   *          63    4209029    2104483+  83  Linux
/dev/hda2         4209030    5269319     530145   82  Linux swap
/dev/hda3         5269320  238227884  116479282+  83  Linux
/dev/hda4       238227885  241248104    1510110   83  Linux
-----------------------------------------------------------------------------------------------

The partition /dev/hda3 starts at LBA 5269320 and extends past the 'problem' LBA. The 'problem' LBA is offset 23421417 - 5269320 = 18152097 sectors into the partition /dev/hda3.
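
The lookup and subtraction above can also be scripted. The sketch below assumes the Start and End columns have been copied out of the fdisk -lu listing by hand as "device start end" triples; part_offset is a hypothetical helper name, not part of any tool.

```shell
# Sketch: find which partition holds a given LBA and the sector offset into it.
# Reads "device start end" triples on stdin; part_offset is a hypothetical helper name.
part_offset() {
    awk -v lba="$1" 'lba >= $2 && lba <= $3 { print $1, lba - $2 }'
}

lba=$((0x016561e9))    # shell arithmetic converts the hex LBA to decimal
printf '%s\n' \
    '/dev/hda1 63 4209029' \
    '/dev/hda2 4209030 5269319' \
    '/dev/hda3 5269320 238227884' \
    '/dev/hda4 238227885 241248104' |
    part_offset "$lba"
# prints: /dev/hda3 18152097
```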

To verify the type of the file system and the mount point, look in /etc/fstab:
-----------------------------------------------------------------------------------------------
root]# grep hda3 /etc/fstab
/dev/hda3 /data ext2 defaults 1 2
-----------------------------------------------------------------------------------------------
You can see that this is an ext2 file system, mounted at /data.

Second Step: we need to find the blocksize of the file system (normally 4096 bytes for ext2):
-----------------------------------------------------------------------------------------------
root]# tune2fs -l /dev/hda3 | grep Block
Block count: 29119820
Block size: 4096
-----------------------------------------------------------------------------------------------
In this case the block size is 4096 bytes.
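
If the block size is needed in a script, it can be picked out of the same output. This is a sketch that assumes the "Block size:" line looks as shown above; fs_block_size is a hypothetical helper name.

```shell
# Sketch: extract the block size from `tune2fs -l` style output on stdin.
# fs_block_size is a hypothetical helper name.
fs_block_size() {
    awk '$1 == "Block" && $2 == "size:" { print $3 }'
}

# Example using the two lines shown above:
printf '%s\n' 'Block count: 29119820' 'Block size: 4096' | fs_block_size
# prints: 4096
```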

Third Step: we need to determine which File System Block contains this LBA. The formula is:
    b = (int)((L-S)*512/B)
where:
    b = File System block number
    B = File system block size in bytes
    L = LBA of bad sector
    S = Starting sector of partition as shown by fdisk -lu
and (int) denotes the integer part.

In our example, L=23421417, S=5269320, and B=4096. Hence the 'problem' LBA is in block number b = (int)(18152097*512/4096) = (int)2269012.125, so b=2269012.

Note: the fractional part of 0.125 indicates that this problem LBA is actually the second of the eight sectors that make up this file system block.
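
The same arithmetic as a small shell sketch. fs_block is a hypothetical helper name; it assumes the block size is a multiple of 512 and divides by sectors-per-block, which gives the same result as the formula above.

```shell
# Sketch: map an LBA to a file system block and the sector position inside it.
# fs_block is a hypothetical helper; arguments: LBA, partition start sector,
# file system block size in bytes (assumed to be a multiple of 512).
fs_block() {
    lba=$1 start=$2 bsize=$3
    per_block=$((bsize / 512))    # 512-byte sectors per file system block
    offset=$((lba - start))       # sectors into the partition
    echo "$((offset / per_block)) $((offset % per_block))"
}

fs_block 23421417 5269320 4096
# prints: 2269012 1   (block 2269012, sector index 1, i.e. the second of eight)
```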

Fourth Step: we use debugfs to locate the inode that owns this block, and the file that the inode belongs to:
-----------------------------------------------------------------------------------------------
root]# debugfs
debugfs 1.32 (09-Nov-2002)
debugfs: open /dev/hda3
debugfs: icheck 2269012
Block Inode number
2269012 41032
debugfs: ncheck 41032
Inode Pathname
41032 /S1/R/H/714197568-714203359/H-R-714202192-16.gwf
-----------------------------------------------------------------------------------------------

In this example, you can see that the problematic file (with the mount point included in the path) is: /data/S1/R/H/714197568-714203359/H-R-714202192-16.gwf
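
For scripting, the inode number can be picked out of the icheck output. The sketch below only parses text on stdin; inode_of_block is a hypothetical helper name, and the sample input is copied from the session above.

```shell
# Sketch: pull the inode number out of `debugfs icheck` style output on stdin.
# inode_of_block is a hypothetical helper; it skips the header line and only
# accepts a purely numeric inode field.
inode_of_block() {
    awk -v blk="$1" 'NR > 1 && $1 == blk && $2 ~ /^[0-9]+$/ { print $2 }'
}

printf '%s\n' 'Block Inode number' '2269012 41032' | inode_of_block 2269012
# prints: 41032
```

On a live system this could probably be fed from a one-shot debugfs call such as debugfs -R "icheck 2269012" /dev/hda3, assuming your debugfs supports the -R (single request) option. If no file owns the block, no inode number appears and the helper prints nothing.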


To force the disk to reallocate this bad block we'll write zeros to the bad block, and sync the disk:
-----------------------------------------------------------------------------------------------
root]# dd if=/dev/zero of=/dev/hda3 bs=4096 count=1 seek=2269012
root]# sync
-----------------------------------------------------------------------------------------------

NOTE: THIS LAST STEP HAS PERMANENTLY AND IRRETRIEVABLY DESTROYED SOME OF THE DATA THAT WAS IN THIS FILE. DON'T DO THIS UNLESS YOU DON'T NEED THE FILE OR YOU CAN REPLACE IT WITH A FRESH OR CORRECT VERSION.
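
Because the write is destructive, it may be worth printing the exact commands and checking the numbers before running anything. The sketch below only echoes commands; zero_block_cmds is a hypothetical helper name, and the read check on the first line is an added suggestion, not part of the procedure above.

```shell
# Sketch: print (not run) a read check and the destructive write for one block.
# zero_block_cmds is a hypothetical helper; arguments: device, block size, block number.
zero_block_cmds() {
    dev=$1 bsize=$2 blk=$3
    # Read check: on an unreadable block this dd would be expected to fail with an I/O error.
    echo "dd if=$dev of=/dev/null bs=$bsize count=1 skip=$blk"
    # The write that overwrites the block with zeros, followed by a sync.
    echo "dd if=/dev/zero of=$dev bs=$bsize count=1 seek=$blk"
    echo "sync"
}

zero_block_cmds /dev/hda3 4096 2269012
```

Note that dd uses skip= to position within the input and seek= to position within the output, which is why the two lines differ.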


Now everything is back to normal: the sector has been reallocated. Compare the output just below to similar output near the top of this article:
-----------------------------------------------------------------------------------------------
root]# smartctl -A /dev/hda
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       1
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       1
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       1
-----------------------------------------------------------------------------------------------

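Note: for some disks it may be necessary to update the SMART Attribute values by running smartctl -t offline /dev/hda. Attributes marked Offline in the UPDATED column, such as Offline_Uncorrectable, are only refreshed by offline data collection.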

The disk now passes its self-tests again:

-----------------------------------------------------------------------------------------------
root]# smartctl -t long /dev/hda [wait until test completes, then]
root]# smartctl -l selftest /dev/hda

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%       239         -
# 2  Extended offline    Completed: read failure       90%       217         0x016561e9
# 3  Extended offline    Completed: read failure       90%       212         0x016561e9
# 4  Extended offline    Completed: read failure       90%       181         0x016561e9
# 5  Extended offline    Completed without error       00%        14         -
# 6  Extended offline    Completed without error       00%         4         -
-----------------------------------------------------------------------------------------------

and no longer shows any offline uncorrectable sectors:

-----------------------------------------------------------------------------------------------
root]# smartctl -A /dev/hda
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       1
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       1
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
-----------------------------------------------------------------------------------------------


Monday, 25 April, 2005. 06:36:20 PM

In the airport this morning I made a small change to the lexer, and now all 6 of my test scripts are successfully processed by my bash parser.

Thursday, 14 April, 2005. 11:13:54 PM

I ripped out the wxWidgets code that displays the bash script parse tree. Large input files generated large parse trees, which wxWidgets choked on: it was drawing bogus lines everywhere. My friend Kevin Geiss warned me about this, and he was right. I've replaced it with Qt, which I've used before with good results. No problems with it so far, so I can get back to debugging the parser. Bash words are very complicated to parse, so that's naturally where the bugs are. Most of the code to handle bash words has to be in the lexical analyzer, because Lemon generates LALR(1) parsers. Bash strings definitely don't fit within that constraint.

Thursday, 14 April, 2005. 06:56:35 PM

NY Post, April 14, 2005 -- WHEN U.S. Supreme Court Justice Antonin Scalia (above) spoke Tuesday night at NYU's Vanderbilt Hall, "The room was packed with some 300 students and there were many protesters outside because of Scalia's vitriolic dissent last year in the case that overturned the Texas law against gay sex," our source reports. "One gay student asked whether government had any business enacting and enforcing laws against consensual sodomy. Following Scalia's answer, the student asked a follow-up: 'Do you sodomize your wife?' The audience was shocked, especially since Mrs. Scalia [Maureen] was in attendance. The justice replied that the question was unworthy of an answer."

So, to translate Scalia's position, he's in favor of government prying into people's personal matters, but when he is asked about that exact same personal matter he won't answer the question. What a hypocrite.

Tuesday, 12 April, 2005. 02:47:24 PM

A new version of VMWare is out.

Monday, 04 April, 2005. 02:11:04 PM

I actually wrote this piece of code today. It does exactly what I want, but it's pretty sneaky. It interleaves an if statement and a case statement, in a way inspired by "Duff's Device".




	case ' ':
	case '\t':
	case '\n':
		if (PARSER_STATE -> lf.special_in_handling) {
			w += ch;
			if (ch == '\n') {
				unput (ch);
				done = true;
			}
		} else {
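	// Yes, this is exactly what you think it is. It's a case label placed
	// inside the else branch of an if statement. Perfectly legal: a '|'
	// jumps straight into the else body, while whitespace only gets here
	// when special_in_handling is false.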
	case '|':
			if ((squoting) || (hquoting) || (backtick) || (backtick2)) {
				cout << "quoting block" << endl;
				w += ch;
				if (ch == '\n') {
					PARSER_STATE -> lf.line_number ++;
				}
			} else if (parmex) {
				cout << "parmex block" << endl;
				parmex = false;
				unput (ch);

			} else {
				cout << "unputting " << (int) ch << endl;		
				unput (ch);
				done = true;
			}
				
		}
		break;


Saturday, 02 April, 2005. 03:30:21 PM

Alex tells me that Kristiana has three new teeth coming in on the top. Up to now, she only had two teeth on the bottom, and those came in several months ago.
