Home
Reading
Searching
Subscribe
Sponsors
Statistics
Posting
Contact
Spam
Lists
Links
About
Hosting
Filtering
Features Download
Marketing
Archives
FAQ
Blog
 
Gmane
From: Ralf Gross <Ralf-Lists <at> ralfgross.de>
Subject: VolumeToCatalog verify job error message: files are in the Catalog but not on disk
Newsgroups: gmane.comp.bacula.user
Date: Monday 6th August 2007 13:04:50 UTC (over 10 years ago)
Hi,

I've configured verify jobs for my backups. This was working ok for the
last 2
weeks with inc and diff backups. But now I get errors for the full backup
verify jobs.


Terminated Jobs:
 JobId Level Files Bytes Status Finished Name
====================================================================
   415 Full 370,351 66.86 G OK 05-Aug-07 02:13 SMTCZB0003
   416 Full 1,631,258 706.3 G OK 05-Aug-07 09:37 VU0EM003
   417 Full 1 882.1 M OK 05-Aug-07 09:39 BackupCatalog
   418 Volu 61,991 0 Error 05-Aug-07 12:15 VerifySMTCZB0003
   419 Volu 449,341 0 Diffs 05-Aug-07 15:10 VerifyVU0EM003
   420 Volu 1 0 OK 05-Aug-07 15:15 VerifyCATALOG



I ran both verify jobs again this moring with the same results.


client SMTCZB0003:
(client is behind a firewall, Heartbeat Interval = 60 + SDConnectTimeout =
4200 fd settings):

06-Aug 08:19 VU0EM005: Verifying against JobId=415
Job=SMTCZB0003.2007-08-05_00.05.00
06-Aug 08:19 VU0EM005: Bootstrap records written to
/opt/bacula/var/bacula/working/VU0EM005.restore.10.bsr
06-Aug 08:19 VU0EM005:
06-Aug 08:19 VU0EM005: The job will require the following
   Volume(s) Storage(s) SD Device(s)
===========================================================================
06-Aug 08:19 VU0EM005:
06-Aug 08:19 VU0EM005: 06D128L3 NEC-T40A NEC-T40A
06-Aug 08:19 VU0EM005:
06-Aug 08:19 VU0EM005: Start Verify JobId=421 Level=VolumeToCatalog
Job=VerifySMTCZB0003.2007-08-06_08.19.38
06-Aug 08:19 VU0EM005: 3307 Issuing autochanger "unload slot 19, drive 0"
command.
06-Aug 08:21 VU0EM005: 3304 Issuing autochanger "load slot 9, drive 0"
command.
06-Aug 08:22 VU0EM005: 3305 Autochanger "load slot 9, drive 0", status is
OK.
06-Aug 08:22 VU0EM005: 3301 Issuing autochanger "loaded? drive 0" command.
06-Aug 08:22 VU0EM005: 3302 Autochanger "loaded? drive 0", result is Slot
9.
06-Aug 08:22 VU0EM005: Ready to read from volume "06D128L3" on device
"LTO3" (/dev/ULTRIUM-TD3).
06-Aug 08:22 VU0EM005: Forward spacing Volume "06D128L3" to file:block 0:1.
06-Aug 08:23 VU0EM005: End of file 1 on device "LTO3" (/dev/ULTRIUM-TD3),
Volume "06D128L3"
06-Aug 08:25 VU0EM005: End of file 2 on device "LTO3" (/dev/ULTRIUM-TD3),
Volume "06D128L3"
[...]
06-Aug 09:45 VU0EM005: End of file 58 on device "LTO3" (/dev/ULTRIUM-TD3),
Volume "06D128L3"
06-Aug 09:46 VU0EM005: New file:
/home/mksadmin/MKS/IntegrityServer/mksis/bulk/^home/mksadmin/Projects/ERW_Projekte/LEGO/Library/Tools/csd/basics/std/docu/en/html/rcs/node2.html/1.5
06-Aug 09:46 VU0EM005:
06-Aug 09:46 VU0EM005: The following files are in the Catalog but not on
disk:
06-Aug 09:46 VU0EM005: /home/mksadmin/.bash_history
06-Aug 09:46 VU0EM005: /home/mksadmin/.xinitrc
06-Aug 09:46 VU0EM005: /home/mksadmin/.Xauthority
[...several more hundred lines...]
06-Aug 09:46 VU0EM005: VerifySMTCZB0003.2007-08-06_08.19.38 Fatal error:
bnet.c:241 Packet size too big from "File daemon:53.115.208.20:9102.
Terminating connection.
06-Aug 09:53 VU0EM005: VerifySMTCZB0003.2007-08-06_08.19.38 Fatal error:
read.c:139 Error sending to File daemon. ERR=Datenübergabe unterbrochen
(broken pipe)
06-Aug 09:53 VU0EM005: VerifySMTCZB0003.2007-08-06_08.19.38 Error:
bnet.c:439 Write error sending 65536 bytes to client:53.115.208.20:36643:
ERR=Datenübergabe unterbrochen (broken pipe)
06-Aug 09:53 VU0EM005: VerifySMTCZB0003.2007-08-06_08.19.38 Fatal error: No
Job status returned from FD.
06-Aug 09:53 VU0EM005: VerifySMTCZB0003.2007-08-06_08.19.38 Error: Bacula
2.0.3 (06Mar07): 06-Aug-2007 09:53:55
  JobId: 421
  Job: VerifySMTCZB0003.2007-08-06_08.19.38
  FileSet: SMTCZB0003
  Verify Level: VolumeToCatalog
  Client: SMTCZB0003
  Verify JobId: 415
  Verify Job: SMTCZB0003
  Start time: 06-Aug-2007 08:19:40
  End time: 06-Aug-2007 09:53:55
  Files Expected: 370,351
  Files Examined: 321,504
  Non-fatal FD errors: 0
  FD termination status: Error
  SD termination status: Error
  Termination: *** Verify Error ***




BTW: I'm wondering about:

06-Aug 09:46 VU0EM005: The following files are in the Catalog but not on
disk:
06-Aug 09:46 VU0EM005: /home/mksadmin/.bash_history".

During this verify job I noticed high network traffic between the dir and
fd.
Is this normal? I thought VolumeToCatalog compares the meta data saved on
tape
with the meta data in the catalog and does not compare the actual data on
disk?

Second thing:

06-Aug 09:46 VU0EM005: New file:
/home/mksadmin/MKS/IntegrityServer/mksis/bulk/^home/mksadmin/Projects/ERW_Projekte/LEGO/Library/Tools/csd/basics/std/docu/en/html/rcs/node2.html/1.5

What are these  characters about? The file is with the correct name in
the db and on tape!


The file /home/mksadmin/.bash_history (I guess all files of jobid 415) was
backed up correctly. There were no errors during backup. I can find the
filename in the db and restore it from tape.  

database:

Enter path with trailing slash: /home/mksadmin/
Enter filename: .bash_history
Enter Client name: SMTCZB0003
+-------+---------------------+------------+-------+------------+
| jobid | jobstarttime | volumename | level | clientname |
+-------+---------------------+------------+-------+------------+
| 415 | 2007-08-05 00:07:06 | 06D128L3 | F | SMTCZB0003 |
[...]


restore:

Enter JobId(s), comma separated, to restore: 415
You have selected the following JobId: 415

Building directory tree for JobId 415 ... 
+++++++++++++++++++++++++++++++++
1 Job, 307,361 files inserted into the tree.
[...]
cwd is: /
$ cd /home/mksadmin
cwd is: /home/mksadmin/
$ ls .bash_history
.bash_history

The restore job finished without errors. So the file is in the db and on
tape
but I get an error message during verify.




Client VU0EM003:
(client not behind a firewall in the same subnet as the the dir)

06-Aug 10:25 VU0EM005: Verifying against JobId=416
Job=VU0EM003.2007-08-05_00.05.01
06-Aug 10:25 VU0EM005: Bootstrap records written to
/opt/bacula/var/bacula/working/VU0EM005.restore.11.bsr
06-Aug 10:25 VU0EM005:
06-Aug 10:25 VU0EM005: The job will require the following
   Volume(s) Storage(s) SD Device(s)
===========================================================================
06-Aug 10:25 VU0EM005:
06-Aug 10:25 VU0EM005: 06D128L3 NEC-T40A NEC-T40A
06-Aug 10:25 VU0EM005: 06D138L3 NEC-T40A NEC-T40A
06-Aug 10:25 VU0EM005:
06-Aug 10:25 VU0EM005: Start Verify JobId=422 Level=VolumeToCatalog
Job=VerifyVU0EM003.2007-08-06_10.25.16
06-Aug 10:25 VU0EM005: 3307 Issuing autochanger "unload slot 19, drive 0"
command.
06-Aug 10:26 VU0EM005: 3304 Issuing autochanger "load slot 9, drive 0"
command.
06-Aug 10:26 VU0EM005: 3305 Autochanger "load slot 9, drive 0", status is
OK.
06-Aug 10:26 VU0EM005: 3301 Issuing autochanger "loaded? drive 0" command.
06-Aug 10:26 VU0EM005: 3302 Autochanger "loaded? drive 0", result is Slot
9.
06-Aug 10:26 VU0EM005: Ready to read from volume "06D128L3" on device
"LTO3" (/dev/ULTRIUM-TD3).
06-Aug 10:26 VU0EM005: Forward spacing Volume "06D128L3" to file:block
68:0.
06-Aug 10:28 VU0EM005: End of file 69 on device "LTO3" (/dev/ULTRIUM-TD3),
Volume "06D128L3"
[...]
06-Aug 11:03 VU0EM005: VerifyVU0EM003.2007-08-06_10.25.16 Fatal error:
read.c:139 Error sending to File daemon. ERR=Die Wartezeit für die
Verbindung ist abgelaufen
06-Aug 11:03 VU0EM005: VerifyVU0EM003.2007-08-06_10.25.16 Error: bnet.c:439
Write error sending 65536 bytes to client:10.60.1.252:36643: ERR=Die
Wartezeit für die Verbindung ist abgelaufen
[...the original verify job showed these errors after Note: 0000002 hours,
I just didn't want to wait these 2 hours again, so it's juts cut & paste]
VU0EM003: VerifyVU0EM003.2007-08-06_10.25 Fatal error: verify_vol.c:108
Data record error. ERR=Connection reset by peer
VU0EM005: The following files are in the Catalog but not on disk:



I a bit lost what is going wrong here and what to check next. This did not
happen with verify jobs that were run against inc or diff jobids.


I changed the Heartbeat Interval for the SD from 10min to 5min and
restarted
the bacula-sd. Then I started the the first verify job again. This time I
got
only one message about missing files (/boot).

Right after that I also reran the verify for the second job, but this job
almost imediatly failed with an error. I could reproduce this 3 times in a
row.

06-Aug 14:02 VU0EM005: Verifying against JobId=416
Job=VU0EM003.2007-08-05_00.05.01
06-Aug 14:02 VU0EM005: Bootstrap records written to
/opt/bacula/var/bacula/working/VU0EM005.restore.15.bsr
06-Aug 14:02 VU0EM005:
06-Aug 14:02 VU0EM005: The job will require the following
   Volume(s) Storage(s) SD Device(s)
===========================================================================
06-Aug 14:02 VU0EM005:
06-Aug 14:02 VU0EM005: 06D128L3 NEC-T40A NEC-T40A
06-Aug 14:02 VU0EM005: 06D138L3 NEC-T40A NEC-T40A
06-Aug 14:02 VU0EM005:
06-Aug 14:02 VU0EM005: Start Verify JobId=426 Level=VolumeToCatalog
Job=VerifyVU0EM003.2007-08-06_14.02.42
06-Aug 14:04 VU0EM005: Ready to read from volume "06D128L3" on device
"LTO3" (/dev/ULTRIUM-TD3).
06-Aug 14:04 VU0EM005: Forward spacing Volume "06D128L3" to file:block
68:0.
06-Aug 14:05 VU0EM005: End of file 69 on device "LTO3" (/dev/ULTRIUM-TD3),
Volume "06D128L3"
06-Aug 14:06 VU0EM005: New file:
/usr/src/linux-headers-2.6.18-3-amd64/include/config/flat/node/mem/
06-Aug 14:06 VU0EM005:
06-Aug 14:06 VU0EM005: The following files are in the Catalog but not on
disk:
06-Aug 14:06 VU0EM005:
/server/wikis/farm/erw-test-wiki/data/pages/TestSnapArcGetEdgeRadius/revisions/00000001
06-Aug 14:06 VU0EM005:
/server/wikis/farm/erw-test-wiki/data/pages/TestSnapArcGetEdgeRadius/revisions/00000002
[...]
06-Aug 14:06 VU0EM005:
/server/projekte/ERW/Freigaben/Freigaben/SRS110_BR216/Freigaben/V0263_Serienfreigabe/HiL/Errormanagement_ABA_CMS_PAS_V0263_051006_Test_291106/m148_int-dtr-03_t3_/Auswert_Daten/m148_INT-DTR-03_t3_-06-12-0
1-10-25-02.rar
06-Aug 14:06 VU0EM005: VerifyVU0EM003.2007-08-06_14.02.42 Fatal error:
bnet.c:241 Packet size too big from "File daemon:10.60.1.252:9102.
Terminating connection.
06-Aug 14:06 VU0EM005: VerifyVU0EM003.2007-08-06_14.02.42 Fatal error: No
Job status returned from FD.
06-Aug 14:08 VU0EM005: VerifyVU0EM003.2007-08-06_14.02.42 Error: Bacula
2.0.3 (06Mar07): 06-Aug-2007 14:08:12
  JobId: 426
  Job: VerifyVU0EM003.2007-08-06_14.02.42
  FileSet: VU0EM003
  Verify Level: VolumeToCatalog
  Client: VU0EM003
  Verify JobId: 416
  Verify Job: VU0EM003
  Start time: 06-Aug-2007 14:02:44
  End time: 06-Aug-2007 14:08:12
  Files Expected: 1,631,258
  Files Examined: 34,757
  Non-fatal FD errors: 0
  FD termination status: Error
  SD termination status: Error
  Termination: *** Verify Error ***

06-Aug 14:08 VU0EM005: message.c:481 Mail prog: bsmtp: bsmtp.c:92 Fatal
malformed reply from localhost: 552 5.3.4 Error: message file too big
06-Aug 14:08 VU0EM005: VerifyVU0EM003.2007-08-06_14.02.42 Error:
message.c:492 Mail program terminated in error.
CMD=/opt/bacula/sbin/bsmtp -h localhost -f "(Bacula) [email protected]" -s
"Bacula: Verify Fatal Error of VU0EM003 Verify Volume to Catalog"
[email protected] ERR=Child exited with code 1



I've np problems with backups, the inc and diff verifies were ok for the
last 2
weeks. But this two full backup verify jobs are always failing.

Ralf





-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
 
CD: 3ms