Re: [Linux] Linux' ta Scandisk ve Defrag Mantığı

From: Ömer F. USTA (omerusta@gmail.com)
Date: Wed 24 May 2006 - 01:06:26 GMT

Previous message: alper endoğru: "[Linux] Linux' ta Scandisk ve Defrag Mantığı"
In reply to: Ahmet Yildiz: "Re: [Linux] Linux' ta Scandisk ve Defrag Mantığı"

This explains a bit about what is called fragmentation and looks at why the
EXT3 file system fragments less, though it is certainly not article quality.

[Wftl-lug] Linux file system defrag
Lew Pitcher wftl-lug@salmar.com
Sun, 03 Mar 2002 00:33:49 -0500

Here's another one for you, boys and girls...

I frequent about 20 or so Linux and Unix newsgroups, and the question of
linux defrag has come up so often in these groups that I've put together
a stock answer that tries to explain what 'fragmentation' is and what
linux does about it. However, although my explanation is detailed in
some respects, it lacks a lot of information in others. I think that I
need to include more information on (1) how linux filesystems (ext2,
ext3, afs, etc.) manage file data block arrangement, in the light of
'file fragmentation', and (2) what performance exposures _are_ present
in the filesystems.

So, I'm asking for suggestions; does anyone here have a good (simple)
explanation of how our filesystems work, and where their weaknesses are?
I'll take anything I can get, and credit you with the information.

FWIW, what follows is my 'stock defrag' answer; enjoy...

In a single-user, single-tasking OS, it's best to keep all blocks for a
file together, because _most_ of the disk accesses over a given period
of time will be against a single file. In this scenario, the read-write
heads of your HD advance sequentially through the hard disk. In the same
sort of system, if your file is fragmented, the read-write heads jump
all over the place, adding seek time to the hard disk access time.

In a multi-user, multi-tasking, multi-threaded OS, many files are being
accessed at any time, and, if left unregulated, the disk read-write
heads would jump all over the place all the time. Even with
'defragmented' files, there would be as much seek-time delay as there
would be with a single-user single-tasking OS and fragmented files.

Fortunately, multi-user, multi-tasking, multi-threaded OSs are usually
built smarter than that. Since file access is multiplexed from the point
of view of the device (multiple file accesses from multiple, unrelated
processes, with no order imposed on the sequence of blocks requested),
the device driver incorporates logic to accommodate the performance hits,
like reordering the requests into something sensible for the device
(e.g. an elevator algorithm).
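
As a rough illustration of that reordering, here is a minimal sketch of a
one-sweep elevator pass. It is a toy model, not the logic of any real driver:
the function names, the block addresses and the starting head position are
all made up for the example.

```python
def elevator_order(pending, head):
    """Return the pending block addresses as one upward sweep from the
    head position, followed by one downward sweep for the rest."""
    upward = sorted(b for b in pending if b >= head)
    downward = sorted((b for b in pending if b < head), reverse=True)
    return upward + downward


def head_travel(order, head):
    """Total distance the head moves when servicing blocks in this order."""
    total = 0
    for block in order:
        total += abs(block - head)
        head = block
    return total


# Hypothetical request queue, in arrival order, and a hypothetical head position.
pending = [98, 183, 37, 122, 14, 124, 65, 67]
start = 53

fifo_travel = head_travel(pending, start)
swept_travel = head_travel(elevator_order(pending, start), start)

print("serviced in arrival order:", fifo_travel)   # 640
print("serviced in elevator order:", swept_travel)  # 299
```

The same eight requests cost less than half the head movement once they are
sorted into a sweep, and nothing about where the files live had to change.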

In other words, fragmentation is a concern when one (and only one)
process accesses data from one (and only one) file. When more than one
file is involved, the disk addresses being requested are 'fragmented'
with respect to the sequence that the driver has to service them, and
thus it doesn't matter to the device driver whether or not a file was
fragmented.

To illustrate:

I have two programs executing simultaneously, each reading two different
files.

The files are organized sequentially (unfragmented) on disk...
[1.1][1.2][1.3][2.1][2.2][2.3][3.1][3.2][3.3][4.1][4.2][4.3][4.4]

Program 1 reads file 1, block 1
                file 1, block 2
                file 2, block 1
                file 2, block 2
                file 2, block 3
                file 1, block 3

Program 2 reads file 3, block 1
                file 4, block 1
                file 3, block 2
                file 4, block 2
                file 3, block 3
                file 4, block 4

The OS scheduler causes the programs to be scheduled and executed such
that the device driver receives requests
                file 3, block 1
                file 1, block 1
                file 4, block 1
                file 1, block 2
                file 3, block 2
                file 2, block 1
                file 4, block 2
                file 2, block 2
                file 3, block 3
                file 2, block 3
                file 4, block 4
                file 1, block 3

Graphically, this looks like...

  [1.1][1.2][1.3][2.1][2.2][2.3][3.1][3.2][3.3][4.1][4.2][4.3][4.4]
}------------------------------>[3.1]
  [1.1]<--------------------------'
    `----------------------------------------->[4.1]
       [1.2]<------------------------------------'
         `-------------------------->[3.2]
                 [2.1]<----------------'
                   `------------------------------->[4.2]
                      [2.2]<--------------------------'
                        `---------------->[3.3]
                           [2.3]<-----------'
                             `------------------------------->[4.4]
             [1.3]<---------------------------------------------'

As you can see, the accesses are already 'fragmented' and we haven't
even reached the disk yet (up to this point, the accesses have been
against 'logical' addresses). I have to stress this: the above
situation is _no different_ from an MSDOS single file physical access
against a fragmented file.
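
To put a number on the picture above, here is a small sketch that replays the
request stream from the illustration against the unfragmented layout and adds
up the head movement. The assumptions are mine: one block is one unit of
distance, the head starts at the left edge, and the names are only for
illustration.

```python
# The unfragmented on-disk layout from the illustration, left to right.
layout = ["1.1", "1.2", "1.3", "2.1", "2.2", "2.3", "3.1",
          "3.2", "3.3", "4.1", "4.2", "4.3", "4.4"]
position = {name: index for index, name in enumerate(layout)}

# The order in which the device driver receives the requests.
requests = ["3.1", "1.1", "4.1", "1.2", "3.2", "2.1",
            "4.2", "2.2", "3.3", "2.3", "4.4", "1.3"]


def travel_in_blocks(names, head=0):
    """Sum of head movements, in blocks, when servicing names in order."""
    total = 0
    for name in names:
        total += abs(position[name] - head)
        head = position[name]
    return total


as_received = travel_in_blocks(requests)
one_sweep = travel_in_blocks(sorted(requests, key=position.get))

print("as received:", as_received)  # 76
print("sorted into one sweep:", one_sweep)  # 12
```

Every file here is contiguous, yet servicing the requests as they arrive costs
76 blocks of travel against 12 for a single sorted sweep.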

So, how do we minimize the effect seen above? If you are MSDOS, you
reorder the blocks on disk to match the (presumed) order in which they
will be requested. On the other hand, if you are Linux, you reorder the
_requests_ into a regular sequence that minimizes disk access using
something like an elevator algorithm. You also read ahead on the drive
(optimizing disk access), buffer most of the file data in memory, and
you only write dirty blocks. In other words, you minimize the effect of
'file fragmentation' as part of the other optimizations you perform
on the _access requests_ before you execute them.

Now, this is not to say that 'file fragmentation' is a good thing. It's
just that 'file fragmentation' doesn't have the *impact* here that it
would have in MSDOS-based systems. The performance difference between a
'file fragmented' Linux file system and a 'file unfragmented' Linux
file system is minimal to none, where the same performance difference
under MSDOS would be huge.

Under the right circumstances, fragmentation is a neutral thing, neither
bad nor good. As to defragging a Linux filesystem (ext2fs), there are
tools available, but (because of the design of the system) these tools
are rarely (if ever) needed or used. That's the impact of designing up
front the multi-processing/multi-tasking multi-user capacity of the OS
into its facilities, rather than tacking multi-processing/multi-tasking
multi-user support on to an inherently single-processing/single-tasking
single-user system.

== And, I'll add Peter T Breuer's <ptb@lab.it.uc3m.es> comments from
== Message-ID: <lo73t9.bdt.ln@news.it.uc3m.es>, posted on
== Wed, 05 Dec 2001 23:52:52 GMT ...

All "fragmented" drives are better than "unfragmented" ones on a
multiuser multitasking o/s. The point is that the machine is doing
many things simultaneously, so it has to jump around even if one task
is interested in only one file. There will be up to a hundred tasks
doing i/o simultaneously.

Yes, all disk drivers use elevator algorithms, in any o/s.

But to answer your question, ext2fs spreads blocks out evenly through
the disk, using various strategies (well, a single mixed strategy).
This reduces the average seek time on a single elevator pass.

Peter

== And I'll conclude with Eric P. McCoy's <ctr2sprt@yahoo.com> comments
== from Message-ID: <87wv019qqt.fsf@providence.local>, posted on
== Wed, 05 Dec 2001 23:52:52 GMT ...

"Linux filesystems" is a little misleading. e2fs doesn't generally
have fragmentation issues, for certain definitions of "fragmentation."

The short answer is this: e2fs splits the disks up into block groups,
which are contiguous regions of blocks. The group will contain a
certain number of inodes and (data) blocks. When you create an inode,
Linux probably chooses the group with the largest number of free
(data) blocks. When you write to an inode, Linux will preferentially
allocate (data) blocks in the same group as the inode. When it has
to, it will move on to another (later) group, but will still try to
keep the blocks together.
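
The placement policy just described can be sketched as a toy allocator. This
is only a model of the description above, not e2fs code: the class and method
names are hypothetical, and wrapping around to the first group after the last
one is my own assumption.

```python
class ToyBlockGroups:
    """Toy model: a disk split into equal, contiguous block groups."""

    def __init__(self, groups, blocks_per_group):
        # free[g] lists the free block numbers of group g, in ascending order.
        self.free = [
            list(range(g * blocks_per_group, (g + 1) * blocks_per_group))
            for g in range(groups)
        ]

    def create_inode(self):
        """Pick the group with the most free blocks as the file's home group."""
        return max(range(len(self.free)), key=lambda g: len(self.free[g]))

    def allocate(self, home, count):
        """Take blocks from the home group first, then from later groups.
        Returns fewer than count blocks if the whole disk runs out."""
        blocks = []
        total_groups = len(self.free)
        for step in range(total_groups):
            group = (home + step) % total_groups  # wrap-around is an assumption
            while self.free[group] and len(blocks) < count:
                blocks.append(self.free[group].pop(0))
            if len(blocks) == count:
                break
        return blocks


fs = ToyBlockGroups(groups=3, blocks_per_group=4)
home_a = fs.create_inode()        # every group is empty, so group 0 is picked
file_a = fs.allocate(home_a, 3)   # [0, 1, 2]
home_b = fs.create_inode()        # group 1 now has the most free blocks
file_b = fs.allocate(home_b, 2)   # [4, 5]
more_a = fs.allocate(home_a, 3)   # [3, 6, 7]: spills forward into group 1
print(file_a, file_b, more_a)
```

File A ends up split across two groups, but only by a short hop forward, which
is the "fragmented by only a few blocks, in the same direction" behaviour.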

The end result of this is that data is generally fragmented by only a
few blocks, and almost always travels in the same direction. That's
as opposed to the front-to-back fragmentation which could, and
frequently did, occur in FAT and its derivatives.

The above works great until the file system is nearly full, at which
point free blocks are scattered all across the disk in discontiguous
locations. This is why, on a nearly-full file system (above 95% or
so), e2fs performance will degrade _substantially_.

Other file systems (HPFS in particular) are similar, but call groups
"bands" or "stripes" instead. HPFS is actually worse than e2fs when
nearly full, because it uses pseudo B-trees for the directory
structure which periodically need to be rebalanced. The problem there
is that, when the file system is nearly full, directories may need to
be rebalanced into many different groups, which will obviously cause
enormous slowdowns. e2fs uses a crummy, paleolithic array for its
directories, which results in far worse performance overall, but wins
out in this one narrow case (or can, depending on what's done to the
directory).

Sorry, but most people on this group know better than to mention "file
systems" and "explain" in the same sentence when I am around.

Eric McCoy <ctr2sprt@yahoo.com>

-- 
Lew Pitcher
Master Codewright and JOAT-in-training
Registered (Slackware) Linux User #112576 (http://counter.li.org/)
On 5/24/06, Ahmet Yildiz <ahmyildiz@gmail.com> wrote:
> It is very likely that this thing called defrag is a piece of software
> produced for Windows by a company called Symantec.
-- 
Ömer Fadıl USTA
http://www.bilisimlab.com/

_______________________________________________
Linux mailing list
Linux@liste.linux.org.tr
http://liste.linux.org.tr/mailman/listinfo/linux
