    Canonfire :: View topic - Announcement: Search through Dragon Magazine
    Announcement: Search through Dragon Magazine
    Author Message
    Apprentice Greytalker

    Joined: Jan 13, 2016
    Posts: 115
    From: SF, CA

    Sat Apr 28, 2018 3:16 pm  
    Announcement: Search through Dragon Magazine

    I'm happy to announce that boccob, the #greytalk bot, now has the ability to search the text of Dragon Magazine and The Strategic Review.

    Every issue will be searched and you will receive your results via private message.

    This project isn't complete yet, so there may be bugs. Let me know here if you spot a problem or have a suggestion for the future.

    Use:

    !bsearch <search terms>

    - quotes and other punctuation are not needed, but if a search turns up nothing, consider retrying with a period or comma where one would appear in the original text (I hope to improve this over time)
    - wildcards, regular expressions, and boolean operators are not currently supported

    !recent <number>

    - A peek at up to 100 (the default is 10) of the most recent searches performed
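
    As a rough illustration of the two commands above, here is a minimal Python sketch of how a chat line might be parsed. This is not boccob's actual code; the names `Command` and `parse_command` are hypothetical, and clamping an out-of-range `!recent` number (rather than rejecting it) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_RECENT = 10  # default stated in the post
MAX_RECENT = 100     # upper limit stated in the post


@dataclass
class Command:
    """Hypothetical container for a parsed bot command."""
    name: str                    # "bsearch" or "recent"
    terms: str = ""              # search terms for !bsearch
    count: int = DEFAULT_RECENT  # number of recent searches for !recent


def parse_command(line: str) -> Optional[Command]:
    """Parse one chat line; return None if it is not a recognised command."""
    parts = line.strip().split(None, 1)
    if not parts:
        return None
    keyword = parts[0].lower()
    rest = parts[1].strip() if len(parts) > 1 else ""
    if keyword == "!bsearch":
        # Terms are passed through as typed; no quotes are required.
        return Command("bsearch", terms=rest) if rest else None
    if keyword == "!recent":
        if not rest:
            return Command("recent")
        if not rest.isdecimal():
            return None
        # Assumption: out-of-range values are clamped to 1..100.
        return Command("recent", count=max(1, min(int(rest), MAX_RECENT)))
    return None
```

    For example, `parse_command("!recent 500")` yields a count of 100 under the clamping assumption, and `parse_command("!recent")` falls back to the default of 10.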


    Notes:

      - This command provides NO access to the underlying material; only the issue and page numbers where the search terms are found are returned

      - Due to limitations and/or corruption in my available data sources:

        o Some issues may never be available.

          * Currently the following issues are not indexed: 147-149


        o Some issues will have ads and other non-article text exposed.

        o I've spent many many hours verifying the accuracy of the text data, but I have not and will never undertake a truly exhaustive review - inaccuracies and mangled text may exist. This may impact the accuracy of your searches.

        o Columnar text has not always been extracted correctly; this may cause false positives or negatives when searching for specific phrases in SOME issues.


      - Due to inconsistent standards, there are likely to be off-by-one errors where the page numbers returned for a given issue are incorrect in terms of the true page count*

      - * Dragon Magazine has at times had some really insane page numbering schemes, if one goes by the page number shown on the physical page. I am only able to index as if counting from the first physical page, incrementing with each new page. I can't do a thing about the weird skips Dragon became fond of over time.

      - Search terms shorter than 4 characters are disallowed; this minimum may increase depending on use

      - In the face of abuse, financial considerations, or entropy, this tool may disappear at any time without warning; if you value it, let me know.
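
    To make the notes above concrete, here is a minimal Python sketch of a search that returns only issue and page numbers, counts pages from the first physical page, and rejects short queries. It is an illustration only, not how boccob is implemented. The `Corpus` layout and the `search` function are hypothetical, and two choices are assumptions: the 4-character minimum is applied to the whole query, and matching ignores case and collapses whitespace.

```python
from typing import Dict, List, Tuple

MIN_TERM_LENGTH = 4  # limit stated in the notes

# Hypothetical storage: source name -> page texts in physical order.
Corpus = Dict[str, List[str]]


def search(corpus: Corpus, terms: str) -> List[Tuple[str, int]]:
    """Return (source, physical page number) pairs, never the page text."""
    # Assumption: collapse whitespace and ignore case before matching.
    query = " ".join(terms.split()).lower()
    if len(query) < MIN_TERM_LENGTH:
        raise ValueError(
            f"search terms must be at least {MIN_TERM_LENGTH} characters"
        )
    hits = []
    for source, pages in corpus.items():
        # Pages are numbered from the first physical page, starting at 1,
        # regardless of the number printed on the page itself.
        for page_number, text in enumerate(pages, start=1):
            normalised = " ".join(text.split()).lower()
            if query in normalised:
                hits.append((source, page_number))
    return hits
```

    Because the page number comes from the position in the list, a hit reported on page 2 may carry a different printed number in the magazine, which is the mismatch described in the footnote.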


    Request for Assistance:

    If you discover that a search did not return a result you know for certain should have been returned (i.e., you can find the search term in a given issue/page number, and didn't receive it in the results from boccob), please reply here with details so I can correct the text.

    As an example: During a test run I searched for "Tasha's Hideous Laughter" in Dragon #338, only to receive no results. However I know that the phrase occurs on page 35 (according to the issue's page numbering).

    All I need from anyone willing to provide help here is a message such as:

    "Tasha's Hideous Laughter", Dragon 338, page 35
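
    For anyone collecting such reports, a line in this shape can be split into its three parts mechanically. The following Python sketch is only an illustration; `Report`, `REPORT_PATTERN`, and `parse_report` are hypothetical names, and accepting an optional "#" before the issue number is an assumption.

```python
import re
from typing import NamedTuple, Optional


class Report(NamedTuple):
    """Hypothetical record for one missed-result report."""
    phrase: str
    issue: int
    page: int  # page number as given by the reporter


# Matches lines such as: "some phrase", Dragon 338, page 35
REPORT_PATTERN = re.compile(
    r'^\s*"(?P<phrase>[^"]+)"\s*,\s*Dragon\s+#?(?P<issue>\d+)\s*,'
    r'\s*page\s+(?P<page>\d+)\s*$',
    re.IGNORECASE,
)


def parse_report(line: str) -> Optional[Report]:
    """Return the parsed report, or None if the line does not match."""
    match = REPORT_PATTERN.match(line)
    if match is None:
        return None
    return Report(match["phrase"], int(match["issue"]), int(match["page"]))
```

    Lines that do not follow the requested shape return None, so they can be set aside for manual reading.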

    Thanks!

    TODO:

      - Find better copies of issues 147-149, which currently resist every attempt to repair them
      - Allow users to restrict the search space
      - Support additional search options
      - Maybe add additional data sources... open to suggestions (RPGA? something else?)
      - Seek cleaner sources or clean stuff up myself


    Fixes/Changes:

      - Indexed issues 5, 7, 11, 12, 14, 15, 17, 19, 20, 26, 35, 47, 49, 61, 63, 65, 67
      - Indexed Oerth Journal 1 - 25
      - Improved formatting of output for extracting by column
      - Repaired and indexed issues 32, 33, 36, 37, 38, 39, 40, 41, 81, 84, 94, 96, 97, 100, 102-110
      - Repaired and indexed issues 112, 113, 116, 119, 120, 122-124, 129-131, 133, 137, 138, 139, 141-146
      - Repaired and indexed issues 151-153, 155, 156, 158, 161, 162, 164, 165, 171, 172, 176-179, 182, 184, 187-189, 191, 194, 195, 198, 201, 202, 209

    TL;DR you can search the text of most issues of Dragon and The Strategic Review (& Oerth Journal)


    Last edited by lamashtu on Sat May 12, 2018 4:43 pm; edited 12 times in total
    Adept Greytalker

    Joined: Jul 29, 2006
    Posts: 494
    From: Dantredun, MN

    Sat Apr 28, 2018 4:35 pm  

    I think I'm too old to understand what any of this is about.
    GreySage

    Joined: Jul 26, 2010
    Posts: 2695
    From: LG Dyvers

    Sat Apr 28, 2018 10:18 pm  

    This will be a very helpful tool to use, lamashtu. Thank you!

    Is this only for use on GreyTalk?

    The problem you experienced searching for "Tasha's Hideous Laugher" may be explained by the fact that you left out the 't' in 'laughter' both times you typed it. ;)

    SirXaris
    _________________
    SirXaris' Facebook page: https://www.facebook.com/SirXaris?ref=hl
    Apprentice Greytalker

    Joined: Jan 13, 2016
    Posts: 115
    From: SF, CA

    Sat Apr 28, 2018 10:32 pm  

    Ha! I never claimed I could type well. ;P

    Sadly though, there was indeed an OCR error during that test.

    This is indeed for greytalk only - and I doubt I'll ever pursue another home for it. Actual participation in chat not required of course - all are welcome to pop in, run some searches, and disappear again.

    Also, for those who were present for my earlier testing, searches no longer take 8-10 minutes. Instead they're averaging around 90 milliseconds currently.
    CF Admin

    Joined: Jun 29, 2001
    Posts: 1477
    From: Wichita, KS, USA

    Sun Apr 29, 2018 7:31 am  

    lamashtu: Are you familiar with the Dragondex @ http://www.aeolia.net/dragondex/ and if so, are you able to import it as the baseline text index for your boccob search?

    Allan.
    _________________
    Allan Grohe (grodog@gmail.com)
    http://www.greyhawkonline.com/grodog/greyhawk.html
    Apprentice Greytalker

    Joined: Jan 13, 2016
    Posts: 115
    From: SF, CA

    Sun Apr 29, 2018 7:49 am  

    grodog: Yes, and yes - but I'm not sure there would be any benefit to try to tie the two together.

    I could provide relevant hits from Dragondex before/after the other results. Or I could create a separate command allowing users to query the Dragondex alone.

    My inspiration to take on this project was a recent question where someone was looking for an article in Dragon, and had a number of key words they associated with it. I thought people might find it useful to have a means to try to discover or re-discover such things.

    I'm certainly open to ideas that would make the tool more useful. Can you explain what you're envisioning?
    Adept Greytalker

    Joined: Jul 29, 2006
    Posts: 494
    From: Dantredun, MN

    Sun Apr 29, 2018 1:18 pm  

    OK, help. What is this? What is the greytalk bot? Is this a phone thing or an app or a website somewhere?
    Apprentice Greytalker

    Joined: Jan 13, 2016
    Posts: 115
    From: SF, CA

    Sun Apr 29, 2018 1:27 pm  

    vestcoat: Sorry, boccob is the name of the bot I created to support the #greytalk channel in IRC (home of the Thursday night chats).

    You could give the search a try by using the "Greytalk Chat Now!" button to the left.
    Apprentice Greytalker

    Joined: Jan 13, 2016
    Posts: 115
    From: SF, CA

    Mon Apr 30, 2018 5:41 pm  

    Indexed Oerth Journals 1 - 25.
    Apprentice Greytalker

    Joined: Jan 13, 2016
    Posts: 115
    From: SF, CA

    Sat May 12, 2018 5:39 pm  

    All issues of Dragon except 147, 148, and 149 are now indexed.

    If I ever manage to get copies of those last three issues that I can work with, I will add them. Otherwise, as there's an overall lack of interest, development has ceased.

    The tool will remain available indefinitely.
    All times are GMT - 8 Hours