Harbour SORT TO refeito!

Mensagem por **Itamar M. Lins Jr.** » 20 Mar 2015 13:08

2015-03-20 16:34 UTC+0100 Przemyslaw Czerpak (druzus/at/poczta.onet.pl)
* include/Makefile
- include/hbdbsort.h
* src/rdd/Makefile
- src/rdd/hbdbsort.c
* src/rdd/dbf1.c
- completely removed old code used in DBF* RDDs as low level backend
for SORT TO ... command and __dbArrange() function.
It was buggy and extremely inefficient in some cases, i.e. after
20 hours I killed process which was sorting table with 2'000'000
records in reverted order so I cannot even say how much time it
needed.

* src/rdd/dbf1.c
+ added new code for table sorting in DBF* RDDs
(SORT TO ... / __dbArrange() backend)
New code fixes many different problems which existed in previous one
like missing support for national collation, wrong descending orders,
wrong sorting of numeric fields with negative values, missing support
for sorting many field types, missing support for transferring MEMO
fields, missing support for transferring records to table with different
field structure or serving by different RDD, etc.
New code is also many times faster then the old one. In practice it
means is now usable for tables with more then few thousands records,
i.e. the test table with 2'000'000 records I used with old code was
copied in sorted order in 13 secs. when pure COPY TO needed 10 secs.
Now it's possible to export sorted tables to different RDDs, i.e.
using DELIM or SDF RDDs in desitnation area.
New code is written in general form without any local to DBF* RDDs
extensions so it can be adopted as base in any other RDD. Probably
I'll move it to default workarea methods so it will be inherited by
all Harbour RDDs and only if necessary authors of some RDDs may
overload it, i.e. to move the operation to the server side in remote
RDDs when source and destination tables are processed by the same
server.
This modification closes the last known for years bug or rather bag
of bugs in Harbour.
; NOTE: For large tables new sorting algorithm may access source records
more then once. It means that results may be wrongly sorted when
sorted fields in exported records are modified concurrently by
other station during exporting. This can be easy eliminated by
copping source records to temporary file anyhow it will introduce
additional overhead in all cases and user can easy eliminate the
problem by simple FLOCK before sort or making export to temporary
file and then sorting this file or he can simply ignore this
problem as unimportant in the specific situation so I decided to
not implement double copping.
I haven't tested what Cl*pper exactly does in such case so
I cannot say if current behavior is or isn't Cl*pper compatible.

best regards
Przemek

2.000.000 milhões de records. Era mais de 20h e caiu para apenas 13 segundos!
COPY TO p/ 10 segundos!

Saudações,
Itamar M. Lins Jr.

Pablo César · Mensagem por **Pablo César** » 20 Mar 2015 14:17

Nossa o homem é fera mesmo !

Obrigado Itamar por destacar