Order of pages in a DUMP DATABASE

Former Member
0 Kudos

Hi.

I know if I perform a DUMP DATABASE in an actively modified database, the pages might be written out in an essentially random order, because ASE will use an algorithm that gives priority to a page that a user is wanting to modify just then. Or something.

But what if the database is idle throughout the whole DUMP operation? Will ASE tend to write the pages out in a fixed order, like in sequential order by page number?

I ask because I'm wondering if it's technically feasible to reduce the file transfer size for a full dump, through use of a diff-type algorithm such as rsync. It seems like if the content of one full dump is wildly different from the content of the next one, due to ASE shuffling the order of the pages each time, then such a diff algorithm could not possibly work. The entire dump file would have to be transferred off each time. But what if one dump file was only slightly different from the last one? Would rsync be able to take advantage of that? And I realize that this could possibly make sense only with UNcompressed dumps.
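To make the idea concrete, here is the kind of small test I have in mind (nothing ASE-specific, and the file names are made up), to check whether rsync's delta transfer only pays for the blocks that differ when two files are mostly the same:

$ dd if=/dev/urandom of=old.dat bs=1024k count=20                        # stand-in for the previous dump
$ cp old.dat new.dat                                                     # stand-in for the next dump...
$ dd if=/dev/urandom of=new.dat bs=1024k count=1 seek=5 conv=notrunc     # ...with roughly 1 MB changed
$ rsync -v --stats --no-whole-file new.dat old.dat                       # --no-whole-file forces delta mode for a local copy
# The "Literal data" and "speedup" figures should show only about 1 MB (plus
# checksum overhead) actually being transferred, not the full 20 MB.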

Thanks.

- John.

Accepted Solutions (1)

Former Member
0 Kudos

John

I know if I perform a DUMP DATABASE in an actively modified database, the pages might be written out in an essentially random order, because ASE will use an algorithm that gives priority to a page that a user is wanting to modify just then. Or something.

Yes.  And Backup Server is much more intelligent than that.

Think about the dump file written in, say, eight stripes.

But what if the database is idle throughout the whole DUMP operation? Will ASE tend to write the pages out in a fixed order, like in sequential order by page number?

AFAIK no order can be determined.  Backup Server maps onto the shared memory segment and can read the caches directly.  So if any order can be identified (noting that it would not be a programmable or predictable one), it would be a list of caches, each in physical order (not MRU-LRU, which is not a physical ordering), plus all the pages that are not in memory, read directly from disk; there would be order in those.  All of that is spliced together by its internal read/write extents algorithm, with another level of splicing if stripes are used.

I ask because I'm wondering if it's technically feasible to reduce the file transfer size for a full dump, through use of a diff-type algorithm such as rsync. It seems like if the content of one full dump is wildly different from the content of the next one, due to ASE shuffling the order of the pages each time, then such a diff algorithm could not possibly work. The entire dump file would have to be transferred off each time. But what if one dump file was only slightly different from the last one? Would rsync be able to take advantage of that?

Now we get to the intent of the question, rather than the question itself.  So what you really want is an efficient way of not storing pages that do not change, in the context of the large volume of full db dump files that are stored, archived, etc.  Is that correct?  And you are researching methods outside ASE that will reduce that volume.

Bit of background.

  • I had the same need, pressed by a couple of large customers, starting about ten years ago.  AFAIC, there was no need to back up pages that did not change.  Think about text and image data that never change, and large stable tables that are added to, but never modified.
  • So I devised an Incremental Backup method.  I will spare you the details, but it was based on sysgams and checksums per AllocUnit/Extent/Page, maintained in both the db and the dump file (a rough sketch of the general idea follows this list).
  • In the good old days, ending Apr 2009 to be precise, when I enjoyed a warm relationship with Sybase Engineering, I discussed it with them.  They asked me to submit it with a formal ER, which I did.
  • Engineering looked at it, and a few months later, they informed me that they had come up with an even better method.  No complaint from me.
  • They promised that it would be implemented in ASE 16.
  • The good old days are long gone.  So are the days of TechWaves and product roadmaps.  Everything "went South".  Now with SAP running the show, things appear to be "going North".
  • No one will discuss ASE 16.  No one can, because there is no roadmap or list of intended features.
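
A very rough sketch of the general idea (not the actual design, which lived inside ASE against sysgams; all file names here are made up): checksum the dump in fixed-size chunks and compare against the previous run to find the chunks that changed.

$ mkdir -p chunks
$ split -b 1M mydb.dmp chunks/mydb.chunk.
$ md5sum chunks/mydb.chunk.* > mydb.md5.new
$ diff mydb.md5.old mydb.md5.new | grep '^>'      # the chunks that changed since the previous dump
$ mv mydb.md5.new mydb.md5.old                    # keep the new checksums for next time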

The next thing I knew, ASE 15.7 SP100 was delivered, a few months ago.  I always read the New Features Guide for every release, and this one is 154 pages.  There is no reference to the series of previous minor releases, so this is not a minor release.  It is packed with a number of large new features.  One of them is the promised Incremental Backup, with a full set of features around it.  They have called it Cumulative Backup.  I have not tested that release yet, but the doco is complete (still not synchronised, but let's not beat a dead horse).

The release is not a minor, or minor-minor, release.  AFAIC it should have been named 15.8, if not ASE 16.  Since there is no list of what ASE 16 is, we can't state that it is, or is not, ASE 16.  And they keep changing the internal and external release names.  If I were to go by promises made five years ago, and the few posts about the subject in between, which is all that I have, this is ASE 16.  (Note that I am not a Sybase employee; I do not speak for Sybase, or wish to appear to do so.  I am speaking for myself, as an old, battle-scarred Sybase hand.)

Please look into ASE 15.7 SP100, the New Features Guide.  What you are seeking, at least on the face of it and unconfirmed, has been delivered, inside ASE.  No need for anything outside ASE, or for post-dump processing.
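
Going by the guide, and entirely untested by me, the syntax is along these lines (the database name and dump device below are made up):

1> dump database mydb cumulative to '/backup/mydb.cum.dmp'
2> go

Loading is then the most recent full dump followed by the most recent cumulative dump.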

Cheers

Derek

Former Member
0 Kudos

Hi Derek,

Thanks. I should have added that I'm using ASE 15.0.3. The Cumulative Backup feature sounds cool though.

Now we get to the intent of the question, rather than the question itself.  So what you really want is an efficient way of not storing pages that do not change, in the context of the large volume of full db dump files that are stored, archived, etc.  Is that correct?  And you are researching methods outside ASE that will reduce that volume.

Yes, I think that is basically correct. I want to copy a periodic dump file over a slow network, and I want the receiving end to always hold the most recent dump file. The worst case is that I copy the entire dump file each time. Better would be to rely on rsync's ability to copy only the parts of a file that have changed since the last time, thereby minimizing the size of the network transfer.
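
Something like this is what I have in mind (the host name and paths below are made up, and it assumes an uncompressed dump already sitting on disk):

$ rsync -av --partial --stats /backup/mydb.dmp drhost:/backup/mydb.dmp
# Because the destination is remote, rsync uses its delta-transfer algorithm by default:
# only the blocks whose checksums differ from the copy already on drhost are sent.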

Thanks.

- John.

Former Member
0 Kudos

Hello Derek,

Roadmaps are here:

service.sap.com/roadmaps

Hans

sladebe
Active Participant
0 Kudos

I'm not sure if anyone will ever read this, but the new ASE roadmap seems to be here:

https://roadmaps.sap.com/board?range=CURRENT-LAST&PRODUCT=67837800100800005166#Q4%202024

(requires an SAP support login to view)

Answers (1)

Former Member
0 Kudos

I'm using rsync with rsyncable gzip on 8 stripes. It works well for me on ASE 15.0.3.

rsync sends around 200M of a 2G rsyncable gzip file.

Former Member
0 Kudos

Further to my previous post, I've decided to give an example of how I do the dump, as that might be of assistance.

Example:

$ mkfifo sysprocs.1of2.fifo                  # one named pipe per dump stripe
$ mkfifo sysprocs.2of2.fifo
$ /usr/local/bin/gzip -1c --rsyncable <sysprocs.2of2.fifo >sysprocs.2of2.dgz &   # background gzip per stripe; --rsyncable keeps
$ /usr/local/bin/gzip -1c --rsyncable <sysprocs.1of2.fifo >sysprocs.1of2.dgz &   # small input changes localised in the compressed output

1> dump database sybsystemprocs to 'compress::0::/backup/sysprocs.1of2.fifo'
2> stripe on 'compress::0::/backup/sysprocs.2of2.fifo'
3> go
...
Backup Server: 4.188.1.1: Database sybsystemprocs: 126318 kilobytes (100%) DUMPED.
Backup Server: 3.42.1.1: DUMP is complete (database sybsystemprocs).
1>
[2] +  Done                    /usr/local/bin/gzip -1c --rsyncable <sysprocs.1of2.fifo >sysprocs.1of2.dgz &
[1] +  Done                    /usr/local/bin/gzip -1c --rsyncable <sysprocs.2of2.fifo >sysprocs.2of2.dgz &


Files at remote:
$ ll sysprocs*dgz
-rw-r-----   1 syb1503  sybase   8983974 Sep 25 14:18 sysprocs.1of2.dgz
-rw-r-----   1 syb1503  sybase   9010398 Sep 25 14:18 sysprocs.2of2.dgz

New local files:

$ ll sysprocs*dgz                                                            
-rw-r-----   1 syb1503  sybase   8983209 Sep 25 14:44 sysprocs.1of2.dgz
-rw-r-----   1 syb1503  sybase   9010320 Sep 25 14:44 sysprocs.2of2.dgz

Sending file 1 to remote with rsync:

building file list ...
done
delta-transmission enabled
sysprocs.1of2.dgz
total: matches=2995  hash_hits=3904  false_alarms=0 data=22169

sent 34294 bytes  received 18060 bytes  34902.67 bytes/sec
total size is 8983209  speedup is 171.59

Sending file 2 to remote with rsync:

building file list ...
done
delta-transmission enabled
sysprocs.2of2.dgz
total: matches=2996  hash_hits=3928  false_alarms=0 data=22320

sent 34445 bytes  received 18066 bytes  35007.33 bytes/sec
total size is 9010320  speedup is 171.59
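
(For the record, delta statistics like the "matches" and "data" figures above come from running rsync with extra verbosity; something along the lines of the following, with a made-up remote host, would produce them:)

$ rsync -avv sysprocs.1of2.dgz remotehost:/backup/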

Former Member
0 Kudos

pd123456 dreyer wrote:

I'm using rsync with rsyncable gzip on 8 stripes. It works well for me on ASE 15.0.3.

rsync sends around 200M of a 2G rsyncable gzip file.

That's great! It mystifies me how this could work so well though. Doesn't rsync have to depend on an assumption that the source file is sufficiently similar to the destination file? Otherwise it would just have to send the entire file each time. It seems like each of your 8 stripe files would tend to be radically different from one dump to the next, given the "randomness" of how ASE sends pages to the dump output.

Thanks.

- John.

Former Member
0 Kudos

John

That's great! It mystifies me how this could work so well though. Doesn't rsync have to depend on an assumption that the source file is sufficiently similar to the destination file? Otherwise it would just have to send the entire file each time. It seems like each of your 8 stripe files would tend to be radically different from one dump to the next, given the "randomness" of how ASE sends pages to the dump output.

I don't know about mystical, but it certainly brings what I posted into question.

My info is from a reputable Sybase employee in the very old days.  Anyone with access to the codeline used to, and still should, correct incorrect posts.  Backup Server has changed a lot in two decades.  I used to say that the only value in a dump file is to load it, and fast.  Some engineer may have come upon the same idea themselves.  AFAIC, the structure of the dump files should be identical to that of the devices we are loading into, and it should count on the issuer identifying the number of dump threads for the best load into production.

From the evidence, namely the predictability, it looks like it is doing that, or close to that.

But I have no idea what Backup Server does these days.

Cheers

Derek

Former Member
0 Kudos

rsyncable gzip info: http://beeznest.wordpress.com/2005/02/03/rsyncable-gzip/

I'm curious as to why you're using rsync/fifo/gzip instead of NFS or some other net file system (or clustered file system).  What benefit do you receive over them?

jason

Former Member
0 Kudos

I create rsyncable gzip files on the local server and then use rsync to send the files to our remote site. The resulting compressed files are almost 17G in size, of which rsync only needs to send about 1.8G, taking a tenth of the time it would take to send the entire compressed files.

The network to the remote site is too slow to consider clustering, and replication could not keep up during the end-of-day bulk processing.

Former Member
0 Kudos

Or Net Backup!