
SAP on SLES with BtrFS

markus_doehr2
Active Contributor
0 Kudos

Hi all,

Is anyone running SLES systems with BtrFS (for /, for the database, or both)? I'd like to hear (and share) experiences.

Regards,

Markus

Accepted Solutions (1)

Former Member
0 Kudos

Hi Markus,

We have been running btrfs for / since SLES11 SP2 on over 60 systems.

By now we only have a mix of SP3 and SP4.

No problems so far.

Regards,

Daniel

markus_doehr2
Active Contributor
0 Kudos

Hi Daniel,

thank you for your input.

We run/ran roughly 80 systems on BtrFS, and there seems to be a regression in kernels > 3.0.101-0.29 that may corrupt the filesystem. SuSE is still investigating.
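To see which machines in a fleet run a kernel newer than the 3.0.101-0.29 mentioned above, a check along the following lines might help. This is only a rough sketch: the helper names `release_key` and `is_newer_than_last_good` are made up for illustration, and the comparison is a naive numeric one that ignores the flavor suffix (such as "-default"); it is not the RPM version-comparison algorithm and says nothing about whether a given kernel is actually affected.

```python
import platform
import re

# Threshold taken from the thread: problems were reported on kernels newer than this.
LAST_GOOD_RELEASE = "3.0.101-0.29"


def release_key(release):
    """Turn a kernel release string into a tuple of ints for comparison.

    Only the leading numeric part is used; a trailing flavor such as
    "-default" is ignored. This is a naive comparison (an assumption of
    this sketch), not the RPM version-comparison algorithm.
    """
    match = re.match(r"\d+(?:[.-]\d+)*", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in re.split(r"[.-]", match.group(0)))


def is_newer_than_last_good(release):
    """Return True if the release sorts after LAST_GOOD_RELEASE."""
    return release_key(release) > release_key(LAST_GOOD_RELEASE)


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` prints
    print(running, is_newer_than_last_good(running))
```

Run on each host (or fed with collected `uname -r` output), this would flag releases such as 3.0.101-63-default or 3.12.44-52.10-default as newer than the threshold.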

Seven systems have crashed so far with filesystem errors where the database had to be restored from backup, including our central BW (1.5+ TB) twice. Two systems could not be restored completely because the filesystem holding the database online logs was hosed. Those crashes happened mostly under little to no load. Older kernels do not have this problem; they run flawlessly.

Which kernel versions do you use?

Markus

Former Member
0 Kudos

Hi Markus,

We are running the newest SP4 kernels:

Linux  3.0.101-63-default #1 SMP Tue Jun 23 16:02:31 UTC 2015 (4b89d0c) x86_64 x86_64 x86_64 GNU/Linux

But we don't have databases on btrfs, only / (OS installation). All SAN data is on ext3, since we use snapshots of the storage system.

Best regards, Daniel

markus_doehr2
Active Contributor
0 Kudos

I see.

We also had one machine with BtrFS on /. The database crashed, wrote a dump and filled the root filesystem, and the whole system rebooted. After that the root filesystem could no longer be mounted (filesystem full), and a balance did not work.

Markus

Former Member
0 Kudos

For this problem there is one solution:

Boot another, newer live Linux, such as a Gentoo boot CD, add a USB stick or other block device to the filesystem, delete snapshots or data, then shrink the filesystem back to the original device only.

You must use another Linux, since adding and removing devices on a btrfs is disabled in SLES.

Newer kernels also have a protection (they reserve some space), so the filesystem is always mountable (even if it is full) and you can always delete files. (Remember, deleting files generates new metadata.) But I don't know if SUSE has backported this to ..

I personally like btrfs, but it has some rough edges you must know about 😉
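The rescue steps above (add a spare device, delete snapshots, remove the spare device again) could be sketched roughly as follows. The function name `rescue_commands`, the mount point `/mnt`, the device `/dev/sdX1` and the snapshot path are all placeholders chosen for illustration, not values from this thread. The sketch only builds and prints the `btrfs` command lines; it executes nothing, so each step can be reviewed before running it by hand from the live system.

```python
import shlex


def rescue_commands(mountpoint, spare_device, snapshots):
    """Return the command sequence as argument lists; nothing is executed.

    mountpoint   -- where the full filesystem is mounted in the live system
    spare_device -- temporary block device (e.g. a USB stick) to add
    snapshots    -- paths of snapshot subvolumes to delete
    """
    # 1. Temporarily grow the filesystem so that metadata can be written again.
    commands = [["btrfs", "device", "add", spare_device, mountpoint]]
    # 2. Free space by deleting snapshot subvolumes.
    for snapshot in snapshots:
        commands.append(["btrfs", "subvolume", "delete", snapshot])
    # 3. Remove the spare device again; btrfs moves its data back
    #    to the original device as part of this step.
    commands.append(["btrfs", "device", "delete", spare_device, mountpoint])
    return commands


if __name__ == "__main__":
    # All three arguments are hypothetical examples.
    for cmd in rescue_commands("/mnt", "/dev/sdX1", ["/mnt/.snapshots/1/snapshot"]):
        print(shlex.join(cmd))
```

Printing instead of executing is deliberate here: on a filesystem that is already in trouble, it seems safer to check each command before it touches the disk.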

Best regards,

Daniel

markus_doehr2
Active Contributor
0 Kudos

Thank you Daniel.

We have eight broken systems now, mainly showing kernel oopses like the following and marking the filesystem read-only. In this case it was "just" /usr/sap, but we had other occurrences where it was the filesystem that holds the database data or log files. In that case the filesystem is broken and one has to restore from a backup.

[   39.497688] WARNING: CPU: 5 PID: 3145 at ../fs/btrfs/super.c:259 __btrfs_abort_transaction+0x4b/0x120 [btrfs]()
[   39.497690] BTRFS: Transaction aborted (error -5)
[   39.497692] Modules linked in: iscsi_ibft iscsi_boot_sysfs af_packet btrfs xfs libcrc32c nls_iso8859_1 nls_cp437 raid6_pq xor vfat fat vmw_balloon coretemp ppdev crc32c_intel vmxnet3 vmw_vmci shpchp parport_pc pcspkr i2c_piix4 serio_raw processor battery ac parport efivars button efivarfs ext4 crc16 mbcache jbd2 vmwgfx ttm drm floppy sr_mod cdrom sd_mod ata_generic ata_piix ahci libahci libata vmw_pvscsi dm_mirror dm_region_hash dm_log dm_mod sg scsi_mod autofs4
[   39.497743] Supported: Yes
[   39.497747] CPU: 5 PID: 3145 Comm: sapstartsrv Not tainted 3.12.44-52.10-default #1
[   39.497750] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.0.B64.1410210136 10/21/2014
[   39.497754]  ffffffffa06b5550 ffffffff81510581 ffff8807c4b21ad8 ffffffff81055362
[   39.497759]  ffff8808147ffa28 ffff8807c4b21b28 00000000fffffffb ffffffffa06b3e50
[   39.497764]  00000000000016b2 ffffffff810553ec ffffffffa06b8c88 0000000000000020
[   39.497769] Call Trace:
[   39.497791]  [<ffffffff8100471d>] dump_trace+0x7d/0x2d0
[   39.497798]  [<ffffffff81004a04>] show_stack_log_lvl+0x94/0x170
[   39.497804]  [<ffffffff81005e31>] show_stack+0x21/0x50
[   39.497812]  [<ffffffff81510581>] dump_stack+0x41/0x51
[   39.497821]  [<ffffffff81055362>] warn_slowpath_common+0x82/0xc0
[   39.497829]  [<ffffffff810553ec>] warn_slowpath_fmt+0x4c/0x50
[   39.497844]  [<ffffffffa060dc0b>] __btrfs_abort_transaction+0x4b/0x120 [btrfs]
[   39.497883]  [<ffffffffa062065f>] __btrfs_free_extent+0x30f/0xc40 [btrfs]
[   39.497930]  [<ffffffffa0625ad2>] __btrfs_run_delayed_refs+0x912/0x11d0 [btrfs]
[   39.497981]  [<ffffffffa062a459>] btrfs_run_delayed_refs.part.66+0x69/0x280 [btrfs]
[   39.498037]  [<ffffffffa063c40d>] __btrfs_end_transaction+0x2ad/0x3d0 [btrfs]
[   39.498113]  [<ffffffffa0645629>] btrfs_truncate+0x1e9/0x2b0 [btrfs]
[   39.498195]  [<ffffffffa0646100>] btrfs_setattr+0x230/0x2e0 [btrfs]
[   39.498266]  [<ffffffff811bc6e1>] notify_change+0x231/0x390
[   39.498275]  [<ffffffff8119fca5>] do_truncate+0x65/0x90
[   39.498283]  [<ffffffff8119ffff>] do_sys_ftruncate.constprop.11+0x11f/0x180
[   39.498294]  [<ffffffff8151e789>] system_call_fastpath+0x16/0x1b
[   39.498302]  [<00007ffff5e3fa97>] 0x7ffff5e3fa96
[   39.498305] ---[ end trace 4280fc12485ab7b5 ]---
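For anyone who wants to scan collected kernel logs for this kind of abort, a small sketch follows. It looks for the "BTRFS: Transaction aborted (error -N)" line seen in the trace above and translates the number into its errno name; error -5 corresponds to EIO, an I/O error. The function name `find_transaction_aborts` is made up for this example, and the message format is assumed to match the one shown in this trace; other kernel versions may word it differently.

```python
import errno
import re

# Message format as it appears in the trace above; other kernels may differ.
ABORT_PATTERN = re.compile(r"BTRFS: Transaction aborted \(error (-?\d+)\)")


def find_transaction_aborts(log_text):
    """Return a list of (errno number, symbolic name) for each abort found."""
    results = []
    for match in ABORT_PATTERN.finditer(log_text):
        # The kernel prints the error as a negative errno value.
        number = abs(int(match.group(1)))
        results.append((number, errno.errorcode.get(number, "unknown")))
    return results


if __name__ == "__main__":
    sample = "[   39.497690] BTRFS: Transaction aborted (error -5)"
    for number, name in find_transaction_aborts(sample):
        print(f"transaction aborted with errno {number} ({name})")
```

The log text could come from saved `dmesg` output or from /var/log/messages; the sketch does not care where the string originates.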

Those problems seem to occur quite randomly. In most cases they happen under no load, when the system is just sitting there.

They all appeared when we used SLES 11 SP3 kernels > 3.0.101-0.29, most of them with the latest kernel 0.55, but also with SLES12 (as you can see here).

Markus

Answers (1)

fabian_herschel
Participant
0 Kudos

With SLES12 btrfs is the default file system for /. This is part of the idea to get a rollback functionality for the *system*, for example after a failed system update.

The following tutorial session from SUSECon2014 explains the ideas, concepts, requirements and limits a bit. It's about "myth and truth".

http://www.susecon.com/doc/2014/sessions/TUT5802.pdf

I hope that helps a bit to distinguish between the SLES11 (SP3) and SLES12 feature sets.

markus_doehr2
Active Contributor
0 Kudos

Hi Fabian,

thank you for sharing.

I know that, and the snapshot functionality is exactly what I planned to use (pre-downtime on upgrades etc.) instead of storage system snapshots.

Unfortunately, seven (7) systems crashed with various BtrFS-related filesystem errors in the last five months. We even had data loss on two (smaller) systems because the log and log mirror filesystem was hosed (on BtrFS).

One system had a root filesystem with LVM and BtrFS that was "full"; rebalancing didn't work because there was allegedly "no space available", so eventually we also had to restore that system from a backup (that one was on SLES12). SuSE support could not help us either.

It may work, but our experience shows it is not really stable, neither for root nor for application or database data. Hence we migrated all our instances (60+) away from BtrFS and now use ext3 and xfs.

--

Markus

fabian_herschel
Participant
0 Kudos

I hope you informed SUSE support about your migration from BtrFS to ext3, so they can pass your bad experiences on to development. I will try to research this issue internally.
If you are interested, please send me a direct email, because I could not see your contact data here in SCN. I may need to know your company name to reference your issue.

markus_doehr2
Active Contributor
0 Kudos

Hi Fabian,

Yes, there are a few SRs open with SuSE. We got a kernel which should help in tracking down the problem, but since we already migrated all systems away in the last weeks (I literally had no weekend in 2 months), we don't have a system on BtrFS any more. Hence we can't install that kernel to see if it helps narrow down the original issue.

Our SAP customer no. is 36620, if that may help. You can also check the Novell SRs 10954983661 and 10964582722.

Another problem with BtrFS as the root filesystem is that a DB2 installation expects a real "filesystem" for /tmp and not a submount. An OSS call (560294/2015) stated that BtrFS as a /tmp submount for a DB2 installation is not supported.

Regards,

Markus