OT: help with search

bruce badouglas at gmail.com
Mon Nov 2 19:02:46 UTC 2015


snidely.. and your TOPHAT!!

thanks for the laugh bro!
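
For the crawling part quoted below (point a crawler at a site, walk the pages in the domain, collect the links), here is a rough sketch in plain Python, standard library only. It is not WebSPHINX and not any particular crawler, just an illustration of the idea. The names LinkParser, extract_links, fetch, crawl and max_pages are made up for this example, and it assumes pages are ordinary HTML with <a href="..."> links. It ignores robots.txt and rate limiting, so treat it as a starting point only.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag fed to it."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) URLs, without fragments, for one page."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if urlparse(absolute).scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar
        if absolute not in links:
            links.append(absolute)
    return links


def fetch(url):
    """Download a page as text (hypothetical helper, no retries)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=50):
    """Breadth-first crawl that only follows links on the start URL's host.

    Returns every link seen, in the order first encountered.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    visited = set()
    found = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        try:
            html = fetch_page(url)
        except OSError:
            continue  # skip pages that fail to download
        for link in extract_links(html, url):
            if link not in found:
                found.append(link)
            if urlparse(link).netloc == host and link not in visited:
                queue.append(link)
    return found
```

To get the one-URL-per-line text file jd1008 asked for, something like open("links.txt", "w").write("\n".join(crawl(start_url))) would do, where start_url is whatever site you want to walk. Keep in mind that public search engines usually restrict automated fetching of their result pages, so this fits crawling a site directly better than scraping search hits.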


On Mon, Nov 2, 2015 at 1:55 PM, Rick Stevens <ricks at alldigital.com> wrote:
> On 11/02/2015 10:48 AM, bruce wrote:
>>
>> Rick!!!
>>
>> Really, you're throwing this person to java!!!  You're a cruel dude!!!
>
>
> Muah-hah-hah! (twirling moustache and cackling evilly!)
>
>> But yeah, a web crawler is what you need. As stated, do a goog search,
>> there are plenty out there. You just need an easy to use crawler that
>> you can point a site at, and the crawler will then iterate through all
>> the pages on/in the domain, retrieving all the links for you.
>
>
> I sorta chose that one as it has a fairly simple "download and use"
> thing going for it.
>
>
>> On Mon, Nov 2, 2015 at 1:31 PM, Rick Stevens <ricks at alldigital.com> wrote:
>>>
>>> On 11/02/2015 10:22 AM, jd1008 wrote:
>>>>
>>>>
>>>>
>>>>
>>>> On 11/02/2015 11:15 AM, bruce wrote:
>>>>>
>>>>>
>>>>> ok...
>>>>>
>>>>>
>>>>> so you have a 'local' site, not a page, and you want to extract/get
>>>>> all the links for the 'domain' of the site you're looking at.
>>>>>
>>>>> you're going to have to have an app/process that crawls the site, and
>>>>> generates the links.
>>>>>
>>>>> there are a bunch of open source tools that let you craft a process
>>>>> to do this, depending on your skillset.  (not sure what your dev
>>>>> level/skillset is)
>>>>>
>>>>> you might also have 'plugins' for the browser that will more or less
>>>>> generate this kind of data.
>>>>>
>>>>> webscraping/crawling/links  <<< terms if you need them.
>>>>>
>>>>> let us know what else you need.
>>>>>
>>>>>
>>>>> On Mon, Nov 2, 2015 at 12:53 PM, jd1008 <jd1008 at gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 11/01/2015 08:01 PM, bruce wrote:
>>>>>>>
>>>>>>>
>>>>>>> hey...
>>>>>>>
>>>>>>> is your issue, you have a specific site you can point to, and you
>>>>>>> want to get links off the site?
>>>>>>>
>>>>>>> or is it something else?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Nov 1, 2015 at 7:30 PM, jd1008 <jd1008 at gmail.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> I googled for a way to list all items found on a single page.
>>>>>>>> What I am searching for is very very specific (in double quotes)
>>>>>>>> and only on a specific web site:
>>>>>>>> For example:
>>>>>>>>
>>>>>>>> my_Favorite_Site.com: "my specific phrase" -some_word
>>>>>>>>
>>>>>>>>
>>>>>>>> It comes up with a total of 12K hits on that web site.
>>>>>>>>
>>>>>>>> I need a way to list the URLs of all the hits, or find a
>>>>>>>> way to easily capture the URLs of all hits without the
>>>>>>>> rigmarole of right-clicking on each link and copying the URL.
>>>>>>>>
>>>>>>>> Has anyone found a way to accomplish this?
>>>>>>>> --
>>>>>>>> users mailing list
>>>>>>>> users at lists.fedoraproject.org
>>>>>>>> To unsubscribe or change subscription options:
>>>>>>>> https://admin.fedoraproject.org/mailman/listinfo/users
>>>>>>>> Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
>>>>>>>> Guidelines: http://fedoraproject.org/wiki/Mailing_list_guidelines
>>>>>>>> Have a question? Ask away: http://ask.fedoraproject.org
>>>>>>
>>>>>>
>>>>>> Just the links, so I can put them in a text file for another program
>>>>>> to go through them.
>>>>>> --
>>>>>>
>>>> No.
>>>> Not  a local page. It can be any public search engine,
>>>> and it can be any specific phrase.
>>>> I already provided an example.
>>>> But the example does not give me just the raw texts of the links of the
>>>> hits found, nor does it give all of them in one fell swoop which you
>>>> could save to a text file.
>>>
>>>
>>>
>>> Try this:
>>>
>>>          https://www.cs.cmu.edu/~rcm/websphinx/
>>>
>>> Perhaps that'll do what you want.
>>> ----------------------------------------------------------------------
>>> - Rick Stevens, Systems Engineer, AllDigital    ricks at alldigital.com -
>>> - AIM/Skype: therps2        ICQ: 226437340           Yahoo: origrps2 -
>>> -                                                                    -
>>> -         The world is coming to an end ... SAVE YOUR FILES!!!       -
>>> ----------------------------------------------------------------------
>>>
>
>
>
> --
> ----------------------------------------------------------------------
> - Rick Stevens, Systems Engineer, AllDigital    ricks at alldigital.com -
> - AIM/Skype: therps2        ICQ: 226437340           Yahoo: origrps2 -
> -                                                                    -
> -   "Do you suffer from long-term memory loss?"  "I don't remember"  -
> -                            -- Chumbawumba, "Amnesia" (TubThumping) -
> ----------------------------------------------------------------------