By Bill Hanigsberg
The issue just won't go away. Whatever the theory says, intuitively we feel that we should be able to search on a sets child records. Of course, if we want to search parent records by a single child value, flattenquery() is a help as is an inverted set . But what if we need a list of the stores in which a single customer bought both franks and beans, or of newspaper articles dealing with birds and dogs. In other words, what if I want to find parent records which share multiple attributes stored in separate child records. A keyword search might help here but I find them a lot like the internet; they run slow and produce too many false positives.
However, would our users complain if we could make an application behave as if it were searching on child records. Would they even know? I am suggesting that we distinguish between how we implement a feature and how we wish users to experience it. I am not saying that these have to be different: only that they don't have to be the same.
This article describes a modest application which appears to sort on multiple overlapping child records. I only go into the essential features of the child-record search and leave out reports and most aspects of the interface.
Clipfiles.adb holds file clippingsthe full text of articles and other text either scanned in or obtained online. Mostly it contains newspaper articles. Article text is stored in a memo field. The database is comprised of four simple tables and a single set.
Articles.dbf holds the article text and has a few other fields for date, source and an id number which is assigned automatically using an auto-increment field rule. The topic_codes field is discussed further on.
1 ARTICLE_NO C 4
2 DATE D 8
3 SOURCE C 50
4 HEADLINE C 100
5 ABSTRACT C 255
6 TEXT M 10
7 TOPIC_CODES C 100 this field must be indexed
Sources.dbf is the source for the lookup field rule defined for the source field in the articles table.
1 SOURCE C 50
Topic_Article.dbf has fields for Article_no, Topic_no, Headline, and Topic
1 TOPIC_NO C 3 0
2 TOPIC_TITLE C 50 0
3 ARTICLE_NO C 4 0
4 HEADLINE C 100 0
5 ARTICLE_DATE D 8 0
Topics.dbf serves as a lookup source for data entry into Topic_Article.dbf and for creating queries topic based on topic categories.
1 TOPIC_NO C 3
2 TOPIC C 30
3 DESCRIPTION C 50
A lookup field rule is defined so the user chooses the topic and the Topic_no field fills automatically.
Article.set is a one-to-many link from Articles.dbf to Topic_Article.dbf. The linking field is Article_no and there is no referential integrity implemented.

Figure 1 Article.set showing topics by article.
Figure 1 shows the application main screen based on Article.set. Tab 1 holds the memo field containing the article text. Tab 3 contains a browse at the set level allowing the user to scan headlines of article in the current query. The active Tab is relevant to the issue at hand. An embedded browse shows child records linked to the current parent. Each topic record has a title and a number.
So far this is an absolutely typical one-to-many arrangement. But look at the sets default form and you observe a new wrinkle.

Figure 2 Article.set, default form.
The default form reveals that the Topic_codes field contains a series of three-digit strings which, you will observe, are the topic numbers of the child records for the current parent. Since these codes have been brought into the parent record we can obviously query on them. For example, we could run a query to find all article records for which the topic_codes field includes the substrings 025 and 026 and this would be equivalent to querying for articles classified under the categories USA and Health Care. In effect, we would have imitated a search on child records.
If this technique is to be useful in practice we must find a way to ensure that:
Although a trifle involved, this turns out to not be so difficult. Well take up the issues in order.
I figured that the easiest way to keep topic_codes accurate was to completely regenerate the field whenever a child record was saved. So I looked to the on_save field rule event in the child table, Topic_article.dbf.
The On_save event takes in all cases since it triggers on enters, changes and deletes. Originally, I also attached the script to the on_delete event, however I soon noticed that the code ran twice when a record was deleted: once for on_delete and again for on_save. Once I realized what was happening I removed it from the on_delete event. Ive left my annotations in the script as a reminder.
''Xbasic
''Update topic string in parent record when child records are added or
altered
''String in topic_codes field is used to search on child records
''This code was also attached to on_delete event
''Removed because on_save is triggered by deleting a record
''1--------------------------------------------------------------------------------------
on error goto 0
dim topics as p
dim parent as p
dim vKeyvals as c
dim global deleting_parent as l 'value set in parent's can_delete event
''2--------------------------------------------------------------------------------------
''Generate an error if the parent table is not open
on error goto Not_in_set
parent=table.get("Articles")
on error goto 0
''3--------------------------------------------------------------------------------------
''If the parent record is being deleted no need to update search string
''Not necessary as range_delete method does not trigger field rule events
''Leave it in for the time being
If deleting_parent then
End
End If
''4--------------------------------------------------------------------------------------
''Fetch through the table and stuff topic numbers into a variable
vKeyvals=""
topics=table.get("topic_article")
topics.Fetch_First() 'current query is imposed by set
while .not.topics.Fetch_Eof()
vKeyvals=vKeyvals+topics.topic_no+space(1)
topics.Fetch_Next()
end while
''Update the topic string field in parent record from the variable
parent.Change_Begin()
parent.Topic_Codes=vKeyvals
parent.Change_End()
END
''5--------------------------------------------------------------------------------------
Not_in_set:
on error goto 0
END
Script 1. OnSaveRecord code for topic_article.dbf.
Heres a brief rundown of the scripts logic. Ive numbered the sections.
In section 1 we explicitly declare variables.
Section 2 verifies that the topic_article table has been opened in a set and creates a pointer to it (parent). because if there is no parent table trying to update the topic_codes field will generate an error. The table_get method does double duty here because it creates a pointer to an open table but generates an error if the named table is not open. Should it occur, this error informs the script that the parent table is not open and the process jumps to section 5 and ends.
Section 3 tests for the value of a logical variable and aborts the script if it is true. The idea is to not bother updating the topic_codes field if the parent record is being deleted. Well get back to this in the discussion of the can_delete code of Articles.dbf below.
The real work takes place in section 4. We fetch through the child records and write the values of topic_no (followed by a space) to a variable (vKeyvals), each code concatenated with those from previous records. We then use the pointer to the parent which we created in section 2 and write the contents of vKeyvals to the Topic_codes field. Since we are changing an existing record we have to use change_begin and change_end methods to throw the table in and out of change mode.
Note that the fetch_first and fetch_last methods find the first and last records linked to the current parent. The set has created the required query on the child table and it is there for us to use. In consequence, the process does not need to run its own query and is virtually instantaneous.
Generate Queries Without Seeing Topic Codes
The second thing we hope to do is allow users to interact with the application as though they were actually querying on child records. I want to give them a Query button which when pushed calls up a dialog in which they can select and deselect topics to query. When theyve selected their topics they need a button to close the dialog and run the query. They also need a way of canceling out of the process.
Look back at figure 1 and you will see that the articles form has a Query by Topics button. The button calls up Tops_dialog, a form based on the Topics table. The code on the Query by Topics button refers to the results of user actions on Tops_Dialog so I will discuss this form first. Then Ill go back to the Query buttons code and describe how it uses the information captured by the dialog .

Figure 3. Topics dialog form
Tops_dialog is contains an embedded browse of topics. Users select or deselect topics by double-clicking on them. The code to accomplish this is attached to the embedded browses On_row_dbl_click event. I took the easy way out. A selected topic is merely a marked record in the topics table. Here is the code
''XBasic ''Toggle marked status of topic records If marked() then parentform.Unmark_record() Else parentform.mark_record() End If END
Script 2. On_row_double_click code for embedded browse in Tops_dialog form
The cancel button merely closes the form with parentform.close(). The code on the query_by_topics button will detect whether cancel was the last button pressed (see below).
The Run button captures user choices by creating a range which includes only marked records, fetching through them and creating a string of the child topic codes separated by spaces.
''Xbasic
btnRun on topics dialog
on error goto 0
hourglass_cursor(.T.) 'will be turned off by frmArticles calling script
''1--------------------------------------------------------------------------------------
dim global vTopic_codes as c
''2--------------------------------------------------------------------------------------
tbl = :Table.Open("topics.dbf")
ndx = tbl.Index_Primary_Get()
range.flags = RANGE_INDEX .or. RANGE_MARKED .or. RANGE_MARKED_DELETED
range.index_pointer = ndx
range.filter = ""
tbl.Range_Add()
''3--------------------------------------------------------------------------------------
vTopic_codes = ""
Tbl.Fetch_First()
while .not. Tbl.Fetch_Eof()
vTopic_codes = vTopic_codes + tbl.topic_no + space(1)
Tbl.Fetch_Next()
end while
''4--------------------------------------------------------------------------------------
tbl.Range_Drop()
parentform.close()
END
Script 3. btnRun code on Topics_dialog form
Section 1 declares a global variable to hold the users choices.
Section 2 creates a range of only marked records of the topics table. The marked records are the users selections.
Section 3 uses the same technique as Script 1 used to fill the Topic_codes field. The records are fetched and each topics code is written into a string with the codes separated by spaces.
Section 4 discards the range and closes the dialog.
The Query_by_Topics button on the Articles form registers the user actions on the dialog and uses them to create a query on the parent record based on the child records the user selected. It is at the heart of the process and needs a bit of exposition.
''Xbasic
btnQuery_by_topics in Articles form
' Search on Multiple Child Records
''Query topic_codes field for records containing all words (topic numbers)
in search string.
''Contents of search string obtained from topics table dialog
''Words in test string may be in any order.
''Number of words in test string varies.
''Custom function Search_exp preformats the query expression
'1-------------------------------------------------------------------------------------------
on error goto 0
dim frm as p
dim tbl as p
dim ndx as p
dim nrecs as n
dim global vTopic_codes as c
dim vMsg as c
dim mode as c
dim last_button as c
'2-------------------------------------------------------------------------------------------
''Save any open record
mode = parentform.mode_get()
if mode = "ENTER" .or. mode = "CHANGE"
parentform.commit()
end if
'3-------------------------------------------------------------------------------------------
''Topics dialog to get query topics
''vTopic_codes will be filled by topics form btnRun code
frm = form.load("tops_dialog","dialog")
frm.show()
last_button = frm.last_pressed()
frm.close() close the dialog form explicitly as the button on the
dialog form only hid it.
''Check whether user cancelled intentionally
if last_button = "btnCancel" then
vMsg = "Query cancelled by user."
ui_msg_box("Cancelling Search",vMsg,UI_STOP_SYMBOL+UI_OK)
end
end if
''If no user input then cancel query
if vTopic_codes = "" then 'no topics were selected
vMsg = "You haven't selected any topics to query."
ui_msg_box("Cancelling Search",vMsg,UI_STOP_SYMBOL+UI_OK)
end
end if
'4-------------------------------------------------------------------------------------------
''Run the query based on vTopic_codes
tbl = table.current()
query.description = "Multiple topics"
query.order = "date"
query.filter = Search_exp (vTopic_codes) see below for a the
Search_exp custom function
query.options = ""
ndx = tbl.Query_Create()
nrecs = ndx.Records_Get()
''Inform the user of the search results
''If there are no hits then discard the query and show all records
if nrecs = 0 then
parentform.index_set("Date")
vMsg = "No items satisfying this request were
found."+chr(13)+"Displaying all items."
else
vMsg = "Items found: "+alltrim(str(nrecs))
end if
ui_msg_box("Query results",vMsg,UI_INFORMATION_SYMBOL+UI_OK)
parentform.window_title="Articles by date"
parentform.resynch()
hourglass_cursor(.F.) 'was turned on by dialog form's btnRun code
END
Script 4. Query_by_topics button code in the Articles form
Section 1 Explicitly declare variables.
Section 2 Save any open record.
Section 3 opens Tops_dialog in (what else?) dialog mode. First off it captures whichever button was used to close the dialog. It then (really) closes the dialog form. Remember that the close action on the dialog form itself only hidbut did not closethe form. We then see what the user did in the dialog. If the user pressed cancel, or made no selections the process is aborted and the appropriate message is issued.
Section 4 runs the query and tells the user what was found. If nothing was found, all records are displayed in date order. The syntax is completely straightforward because the task of generating the query expression--the only hard parthas been displaced into a custom function (Search_exp) So to understand the expression we must look at the function.
function Search_exp as C(search_string as C) dim i as n dim count as n dim nwords as n dim tmpstr as c nwords = w_count(search_string) tmpstr = "" count = nwords-1 for i= 1 to count tmpstr = tmpstr+"\""+word(search_string,i)+"\""+"$topic_codes.and." next i tmpstr = tmpstr+"\""+word(search_string,nwords)+"\""+"$topic_codes" Search_exp = tmpstr end function
Script 5. Custom function to generate the query expression.
The function takes a string of topic codes and formats it into a query expression. The two following lines, copied from the interactive window show what it does:
?Search_exp ("001 002 003 ")
=
"\"001\"$topic_codes.and.\"002\"$topic_codes.and.\"003\"$topic_codes"
The topics the user selected are passed from the dialog to Search_exp via the global variable vTopic_codes. The function measures the number of topics by counting the words in the input string and uses $, the inclusion operator, to create an expression which separately tests whether each code exists in the topic codes field. Since we link the topics with the .and. operator, all topics must occur for an article to be included in the query list.
As I did, you might suppose it possible to test all the codes at once, for example test for an occurrence of the string 025 030 but unfortunately this won't work because if an article had been classified under topics 025 026 and 030 the topic_codes field would contain 025 026 030 and the test string 025 030 would not produce a match. Fortunately, even though the shortcut won't work the query runs very fast. Make sure to index the topic_codes field.
You will remember that the set design does not employ referential integrity so neither changes nor deletes of parent records cascade to the children. Not cascading changes is no problem since the linking key, article_no, is controlled by an autoincrement field rule in the parent. It need not appear on any forms since a user would have no reason to edit it. The absence of cascading deletes is another matter since every deleted parent record would leave behind orphan records in the topic_article table.
I did, however, have a worry. Since the child tables on_save code makes reference to a field in the matching parent record, I worried about the sequence of events. What if the child record being deleted tried to update the topic_codes field of the just-deleted parent? This is why I removed referential integrity and cascading deletes I had initially installed and tried to see if I could control the order of events making sure the child records were deleted prior to the parent. While I was at it, I wondered if I could prevent the childs on_save code from executing at all since all it would do is update the parent record a quarter of a second before it was removed.
This turned out to be easy, but also (partially) unnecessary.
The can_delete event triggers just prior to a deletion. My plan was to attach code deleting the child records to the parent records can_delete event. That way the parent would still exist as the child records were deleted so no error could occur.
''Xbasic
Articles.dbf CanDeleteRecord
''Delete child records *before* deleting parent so child table's on_delete
'' code doesn't hang
''Still will lose time updating search string in parent prior to deleting
'' parent but shouldn't take long and deletion will be infrequent
''Referential integrity is *off*
'1-------------------------------------------------------------------------------------------
''Generate an error if the child table is not open
on error goto Not_in_set
child=table.get("Topic_Article")
on error goto 0
'2-------------------------------------------------------------------------------------------
dim global deleting_parent as l 'for use in topic_article's on_save and
on_delete scripts
deleting_parent = .t. 'don't bother to update search string in parent record
Child.Delete_Range() 'Remove matching child records
deleting_parent = .f. Reset the flag
END
'3-------------------------------------------------------------------------------------------
Not_in_set:
on error goto 0
END
Script 6. Can_delete code in Articles.dbf.
Section 1 Test to see if the child table is open by generating a pointer to it. If no, abort the process.
Section 2 Declare a logical variable Deleting_parent for use as a flag. Use the delete_range method to remove the child records. This method is particularly convenient. If you don't specify a range it uses the current range which (if the set is open) is exactly what you want, the children of the current parent. But it is even more convenient than I knew. The documentation describes <tbl>,delete_range() as a high-level utility function. As a result it does not trigger the on_delete code in the child table and my flag variable is not needed.
But I didn't know that yet which is why I included section three in the topic_article tables on_save code to abort the script when deleting_parent is true. Ive left it in because I may be adding other code which I want to run conditionally. A simple autoexec script initializes deleting_parent to false when the application starts.
''XBasic ''Autoexec for Clip file application :A5.window_title="Clipfile" dim global deleting_parent as l 'for use in on_save and on_delete scripts deleting_parent = .f. 'initialize to the conservative value END
Script 7 Autoexec script initializes deleting_parent flag.
Deceive our users though we may, we ought never deceive ourselves. We are not really querying a set on values held in its child records and the technique described here is a workaround rather than a solution. This is a pity because solutions are generic whereas workarounds are specific--to particular situations, types of data, sizes of table, etc.
The most important limitation is the number of child records. A true one-to-many can in theory have an infinite number of child records but this technique breaks down as soon as the number of child records becomes too big to code into the parent topic_codes field. The 100 character length of this field accommodates 25 child records because each code takes three characters plus a space.
You will notice that the 100 character length of the topic_codes field in the articles table happens to be just about the maximum length of an Alpha 5 index key. I hoped that limiting the field to the key length would improve query execution. The keywords function is designed for use in building indexes on long character fields. Perhaps an index expression using this function would run faster (perhaps optimizing better) than one built directly on the field. That is
keywords(Topic_codes) instead of merely Topic_codes
I have experimented a bit with this and I must say that the queries I have run--on a table with hundreds rather than thousands of records, it must be admittedrun instantaneously on a slow machine regardless of how the index is constructed.
I also wonder whether an index which used the keywords function would permit me to increase the length of the topic_codes field to 255 characters without compromising performance. I haven't tried this yet.
A second limitation of the technique is that it would be difficult to implement in situations where child records are not standardized by a lookup table. The contents of the topic_codes field is restricted to a delimited set of words it is simple to provide users with an easily-understood interface with which to generate their queries.
However, under certain circumstances these limitations may not be burdensome. I compile a lot of bibliographies and reference lists and it is difficult to imagine needing to classify a book under more than 25 categories. In fact, I don't recall ever using 10. Also, categorization would almost always be structured by lookup tables. Finally, although bibliographies may hold thousands of records they would rarely contain hundreds of thousands. There must be many similar situations where the technique could be helpful.
So, even though it is not yet Christmas in July, this approach mayon occasion--make it possible for us to achieve our objectives.

Figure 4 Report illustrating a favorite set of overlapping categories.
| This is Bill's first contribution to learnalpha.com, and we hope it won't be his last. Bill is a member of the Department of Sociology at Dawson College in Westmount, Quebec. You can reach Bill at whanigsberg@dawsoncollege.qc.ca. Bill has provided a download to illustrate the concepts in this article. |
7/17/00
Don't forget, we need your feedback to make this site better!