by Dr. Peter Wayne
At the recent Alpha Five conference, Selwyn Rabins demonstrated a new "Supa-De-Dupa" utility to delete duplicate table entries. Unlike the current duplicate entry deleter, this one enables you to compare both entries before deciding which one to delete. I don't want to steal anything from version 6, but I thought I would pick up my keyboard and cobble together a little utility that should keep you going until Alpha Software releases v6.
Here is a typical view of a personnel table:
Figure 1. View of personnel table.
Notice all the typical errors. Finian is in "development" and "dev.", Jeff is in "customr support" and "customer support" and "cust. support". Our duplicate deletion utility will enable us to choose the table to check, then choose the fields to use to determine if 2 records might be duplicates, then display other fields that we want to display to verify whether the records are truly duplicates, and then finally decide which record to delete. Here it is in action:
Figure 2. First we choose a table from all the tables in the database.

Figure 3. Clicking on a table shows us the fields that can be checked for duplicates.
Fig. 4. In this simple example, I have chosen to check the "Name" field for duplicates, but also to display the department and identifiers of each employee in the personnel file.
I then click on "OK" and see this dialog:
Figure 5. This dialog shows that "finian" is in both "development" and "dev." His employee identifier is listed as 00004 in one place and 000012 in the other, and the record numbers are record number 3 and record number 11. No one said this was anything but a utility, mind you!
Once I select one of the duplicates, the "Delete" button becomes activated:

Fig. 6. Select an entry to enable deletion.
The code creates 2 Xdialogs. The first one is shown in Figs. 1-4, and the second is displayed in Figs. 5 & 6. Here is the code, with comments:
'Date Created: 13-Apr-2003 08:55:08 AM
'Last Updated: 20-Apr-2003 09:44:17 PM
'Created By : Peter Wayne
'Updated By : Peter Wayne
'dupe_check
' Limitations: This script creates an index on the fields to be compared. Therefore it cannot
' compare field types, such as memos or blobs, that cannot be indexed. In addition, although
' it can compare multiple fields, if the length of all the fields to be compared is greater
' than the maximum length of an A5 index key, the program won't be able to create the index.
'
' get list of tables
tables=a5.table_enum()
tables=sortsubstr(tables,crlf())
dlg_title="Duplicate checker"
table_choice=""
dim fields as c
dim other_fields as c
' here comes first xdialog (modal)
result=ui_dlg_box(dlg_title,<<%dlg%
{can_exit=on_exit}
{region=a}
Select table:;
[.40,5table_name^#tables!table_choice];{lf}
{endregion};
{frame=1,1}
{region=a};
Select fields to check for duplicates:;
[%m%.40,5fields_to_include^#fields!fields_chosen]|
|[%mw%.40,5fields_to_include]
{endregion};
{line=1,0};
{region=a}
Other fields to show in addition: ;
[%m%.40,5other_fields_to_show^#other_fields]|
|[%mw%.40,5other_fields_to_show]
{endregion};
<15OK> <15Cancel>
%dlg%,<<%code%
if a_dlg_button="table_choice" then
fields=table.external_field_name_get(table_name)
fields=filter_string(fields,"Memo",crlf(),.t.) ' exclude memo fields
fields=sortsubstr(fields,crlf())
a_dlg_button=""
end if
if a_dlg_button="fields_chosen" then
other_fields=filter_string_multi(fields,fields_to_include,crlf(),.t.)
a_dlg_button=""
end if
%code%)
' now, here comes the second Xdialog
if result="OK" then
deletion_list=""
dlg_title="Checking"
if right(fields_to_include,2)=crlf() then
fields_to_include=left(fields_to_include,len(fields_to_include)-2)
end if
query.filter=".t."
query.order=strtran(""+fields_to_include,crlf(),"+")
query.options=""
dim t as p
dim ix as p
t=table.open(table_name)
ix=t.query_create()
dim old_key as p
dim new_key as p
dim old_rec as c
dim new_rec as c
dim dupes as c
Next_label="Next"
Delete_label="Delete this one"
remove=0
t.fetch_first()
old_key=ix.key_get()
old_rec=*for_each(x,eval("t."+x),fields_to_include+crlf()+\
other_fields_to_show+crlf()+"recno()")
old_rec=strtran(old_rec,crlf(),"|")
t.persist() ' t.persist() is necessary to keep the table open during the modeless
' Xdialog, which runs *after* the rest of this script closes!
ui_modeless_dlg_box(dlg_title,<<%dlg%
{can_exit=on_exit}
{startup=next}
Potential duplicates: ;
[.60,4remove^#dupes];
<10=Next_label!next> <10Quit!on_exit> <20=Delete_label!delete?remove\>0>;
%dlg%,<<%code%
if a_dlg_button="on_exit" then
if deletion_list>"" then
*for_each(record,delete_one(t, record),deletion_list)
end if
t.close()
ui_modeless_dlg_close(dlg_title)
end if
if a_dlg_button="delete" then
active_record=word(dupes,remove,crlf())
rec_number=word(active_record,occurs("|",active_record),"|")
deletion_list=deletion_list+rec_number+crlf()
a_dlg_button="next"
end if
if a_dlg_button="next" then
dupes=""
remove=0
while .not. t.fetch_eof()
t.fetch_next()
if t.fetch_eof() then
remove=0
exit while
end if
new_key=ix.key_get()
if new_key.key=old_key.key then
new_rec=strtran(*for_each(x,eval("t."+x),fields_to_include+crlf()+\
other_fields_to_show+crlf()+"recno()"),crlf(),"|")
dupes=old_rec+crlf()+new_rec
remove=0
exit while
end if
old_key=new_key
old_rec=strtran(*for_each(x,eval("t."+x),fields_to_include+crlf()+\
other_fields_to_show+crlf()+"recno()"),crlf(),"|")
end while
if t.fetch_eof() then
if deletion_list>"" then
*for_each(record,delete_one(t, record),deletion_list)
end if
t.close()
ui_modeless_dlg_close(dlg_title)
end if
end if
%code%)
end if
end
function delete_one as v(t as p, record as c)
rec=val(record)
t.fetch_goto(rec)
t.change_begin()
t.delete()
t.change_end()
end function
You can copy this code into your editor and run it. You can see that there is room for improvement: the user interface for the second Xdialog leaves something to be desired, and also this duplicate checker is unable to compare blob fields, or combinations of fields whose length is greater than A5's maximum index key length (I think it's 255 bytes -- someone correct me if I'm wrong).
4/20/2003
Don't forget, we need your feedback to make this site better!