Re: panic: Journal overflow

From: Eric Anderson <anderson_at_freebsd.org>
Date: Thu, 26 Apr 2007 07:58:56 -0500
On 04/26/07 07:28, Rong-en Fan wrote:
> On 4/26/07, Eric Anderson <anderson_at_freebsd.org> wrote:
>> On 04/26/07 03:32, Rong-en Fan wrote:
>>> On 4/26/07, Eric Anderson <anderson_at_freebsd.org> wrote:
>>>> On 04/25/07 13:24, Rong-en Fan wrote:
>>>>> This is an i386 current SMP box as of Apr 14 or 15. I got
>>>>> a panic with geom journal.
>>>>>
>>>>> panic: Journal overflow (joffset=3246704015360 active=3246705852416 inactive=324
>>>>> cpuid = 2
>>>>> KDB: stack backtrace:
>>>>> db_trace_self_wrapper(c06b3e2c,e5321944,c04fc86e,c06c70c8,2,...) at db_trace_sel
>>>>> kdb_backtrace(c06c70c8,2,c06ad0d2,e5321950,5,...) at kdb_backtrace+0x2f
>>>>> panic(c06ad0d2,eea3b800,2f3,eebfc000,2f3,...) at panic+0x11f
>>>>> g_journal_check_overflow(c4fc5c00,cb030a00,eb,95428000,eb,...) at g_journal_chec
>>>>> g_journal_flush(c4fc5c00,0,eb,95428000,eb,...) at g_journal_flush+0x60d
>>>>> g_journal_add_current(c4fc5c00,c9cb0948,ca8e818c,c4fc5c00,e5321cbc,...) at g_jou
>>>>> g_journal_release_delayed(c4fc5c00,0,ca8e818c,c4fdadc0,2,...) at g_journal_relea
>>>>> g_journal_flush_send(c4fc5c00,c8521c60,205d0000,1b2,205d4000,...) at g_journal_f
>>>>> g_journal_worker(c4fc5c00,e5321d38,0,0,0,...) at g_journal_worker+0x7f7
>>>>> fork_exit(c04b5813,c4fc5c00,e5321d38) at fork_exit+0x83
>>>>> fork_trampoline() at fork_trampoline+0x8
>>>>> --- trap 0, eip = 0, esp = 0xe5321d70, ebp = 0 ---
>>>>>
>>>>> Sorry, I don't have a core dump available. I will set one up next time.
>>>>> After the panic, the system hangs at the single-user prompt:
>>>>>
>>>>> Trying to mount root from ufs:/dev/da0s1a
>>>>> WARNING: / was not properly dismounted
>>>>> Loading configuration files.
>>>>> kernel dumps on /dev/da0s1b
>>>>> Entropy harvesting: interrupts ethernet point_to_point kickstart.
>>>>> swapon: adding /dev/da0s1b as swap device
>>>>> Starting file system checks:
>>>>> /dev/da0s1a: 1704 files, 56880 used, 196935 free (631 frags, 24538 blocks, 0.2%
>>>>> Can't stat /dev/concat/data.journal: No such file or directory
>>>>> /dev/da0s1f: 237 files, 1208 used, 11877856 free (64 frags, 1484724 blocks, 0.0%
>>>>> GEOM_JOURNAL: Journal concat/data consistent.
>>>>> /dev/da0s1e: UNREF FILE I=548536  OWNER=root MODE=100644
>>>>> /dev/da0s1e: SIZE=717466 MTIME=Apr  4 07:58 2007  (CLEARED)
>>>>> /dev/da0s1e: FREE BLK COUNT(S) WRONG IN SUPERBLK (SALVAGED)
>>>>> /dev/da0s1e: SUMMARY INFORMATION BAD (SALVAGED)
>>>>> /dev/da0s1e: BLK(S) MISSING IN BIT MAPS (SALVAGED)
>>>>> /dev/da0s1e: 190105 files, 763850 used, 2281197 free (36557 frags, 280580 blocks
>>>>> /dev/da0s1d: 17507 files, 40158 used, 972857 free (641 frags, 121527 blocks, 0.1
>>>>> THE FOLLOWING FILE SYSTEM HAD AN UNEXPECTED INCONSISTENCY:
>>>>>         ufs: /dev/concat/data.journal (/data)
>>>>> Unknown error; help!
>>>>> Enter full pathname of shell or RETURN for /bin/sh:
>>>>> # mount -a
>>>>> WARNING: R/W mount of /backup denied.  Filesystem is not clean - run fsck
>>>>> mount: /dev/concat/data.journal : Operation not permitted
>>>>> # fsck_ffs -p /dev/concat/data.journal
>>>>>
>>>>> I need to issue 'fsck_ffs -p' myself... any idea about why this happens?
>>>>>
>>>>> The geom journal setup:
>>>>>
>>>>> Geom name: gjournal 68372861
>>>>> ID: 68372861
>>>>> Providers:
>>>>> 1. Name: concat/data.journal
>>>>>    Mediasize: 3246137539072 (3.0T)
>>>>>    Sectorsize: 512
>>>>>    Mode: r1w1e1
>>>>> Consumers:
>>>>> 1. Name: concat/data
>>>>>    Mediasize: 3247211281408 (3.0T)
>>>>>    Sectorsize: 512
>>>>>    Mode: r1w1e1
>>>>>    Jend: 3247211280896
>>>>>    Jstart: 3246137539072
>>>>>    Role: Data,Journal
>>>>>
>>>>> The gconcat consists of two SCSI disks (actually, they are RAID), da0 and da1.
>>>>> Oh no, it panics with journal overflow again while writing this message :(
>>>>>
>>>>> data.journal is shared over NFS, and two boxes are running tar
>>>>> operations that write to this partition.
>>>> You need to change your journal switch and cache switch thresholds to
>>>> something like this:
>>>>
>>>> kern.geom.journal.force_switch=50
>>>> kern.geom.journal.cache.switch=75
>>>>
>>>> Try that and see if that eases your pain a bit.
>>> This does not help :(
>>> In the past two hours, I tried tuning these two sysctls a bit.
>>> The result: more than 10 panics :(
>>
>> How low did you try them?
> 
> force_switch = 10, cache.switch = 50, and the cache is 100MB.
> 
>>> I have to remove gjournal on concat/data. I do the following:
>>>
>>> gjournal stop concat/data
>>> gjournal clear concat/data
>>> tunefs -J disable /dev/concat/data
>>>
>>> If I want to turn on gjournal someday, can I do it without
>>> recreating the filesystem?
>>
>> Without doing any research, I would say 'sure', because I can't think of
>> a reason why not.  I think just relabeling it, then turning it on via
>> tunefs, would do it.
> 
> I read gjournal(8), it seems that I can't unless I use
> another journal provider...


Did you already disable it?
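
For reference, the Jstart and Jend values in the gjournal listing quoted
above can be turned into a journal size, and the joffset and active values
from the panic line can be placed relative to the start of the journal. The
snippet below is only a rough arithmetic sketch: the function names are made
up for illustration, it assumes Jend - Jstart is the usable journal area in
bytes, and it ignores the inactive value because the panic line is cut off
at that point.

```python
# Values copied from the quoted gjournal listing and the panic message.
JSTART = 3246137539072   # Jstart from the consumer entry
JEND = 3247211280896     # Jend from the consumer entry
JOFFSET = 3246704015360  # joffset from the panic line
ACTIVE = 3246705852416   # active from the panic line


def journal_size(jstart: int, jend: int) -> int:
    """Journal area size in bytes, assuming it spans Jstart..Jend."""
    return jend - jstart


def offset_in_journal(pos: int, jstart: int) -> int:
    """Distance of an absolute provider offset from the journal start."""
    return pos - jstart


size = journal_size(JSTART, JEND)
print(f"journal size: {size} bytes ({size / 2**30:.2f} GiB)")
print(f"joffset: {offset_in_journal(JOFFSET, JSTART)} bytes into the journal")
print(f"active:  {offset_in_journal(ACTIVE, JSTART)} bytes into the journal")
print(f"active - joffset: {ACTIVE - JOFFSET} bytes")
```

With these numbers the journal comes out at 1 GiB, and both offsets sit a
bit past the middle of it, under 2 MB apart. That is small next to two
clients streaming tar data over NFS, but without the truncated inactive
value this does not say which limit was actually hit.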


Eric
Received on Thu Apr 26 2007 - 10:58:58 UTC
