- To: Sebastian Spiess <sebastian.spiess@xxxxxxxxx>
- Subject: Re: [SLUG] search engine for company network (OT)
- From: Glen Turner <gdt@xxxxxxxxx>
- Date: Wed, 14 May 2008 11:10:10 +0930
- Cc: slug@xxxxxxxxxxx
- Openpgp: url=http://www.gdt.id.au/~gdt/gdt.gdt.id.au.pubkey.asc
- User-agent: Thunderbird 2.0.0.12 (X11/20080226)
Sebastian Spiess wrote:
> Does anyone have an idea, something I could investigate further? A
> software name?
I index my server's disks using htdig. There are backends for .PDF,
.DOC, OpenDocument and so on, and it's not at all difficult to add
support for other file formats (basically you write a small program
to spit out the text in the file. I wrote one to pull the ID3 tags
from my music files; based on that, I wouldn't expect any trouble
writing one for DXF.)
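As a rough illustration of what such a "spit out the text" program
might look like for DXF, here is a minimal sketch. It assumes ASCII
DXF files, where data comes as pairs of lines (a group code, then a
value) and the strings of TEXT and MTEXT entities sit in group codes
1 and 3. The function name dxf_text is made up for this example, and
how the script gets hooked into htdig is not shown; check the htdig
documentation for that part.

```python
import sys


def dxf_text(lines):
    """Yield the text strings of TEXT and MTEXT entities in an ASCII DXF file.

    Assumption: the input is ASCII DXF, i.e. alternating lines of
    group code and value. Binary DXF is not handled.
    """
    entity = None
    it = iter(lines)
    # zip(it, it) walks the file two lines at a time: (group code, value).
    for code_line, value_line in zip(it, it):
        code = code_line.strip()
        value = value_line.rstrip("\r\n")
        if code == "0":
            # Group code 0 starts a new entity (or section marker).
            entity = value.strip()
        elif entity in ("TEXT", "MTEXT") and code in ("1", "3"):
            # Code 1 holds the text string; code 3 holds extra MTEXT chunks.
            yield value


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print one extracted string per line for the indexer to consume.
    with open(sys.argv[1], errors="replace") as f:
        for text in dxf_text(f):
            print(text)
```

Run as "python dxf_text.py drawing.dxf" (both names are placeholders)
it prints one string per line. MTEXT formatting codes are left in the
output as-is, which is probably acceptable for indexing purposes.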
The way it works is that I present my server's disks via Samba, NFS
and WebDAV. Reading WebDAV is just like reading a web server, so
htdig will index it fine. When users search, they use the web
interface and pull the matching file using HTTP when they click
on the link. Obviously you protect both htdig and the WebDAV
using HTTPS and authentication.
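To make the "HTTPS and authentication" point concrete, here is a
small sketch of how a script might build a request for a file on
such a WebDAV share. It assumes HTTP Basic authentication, which is
only one of several possible mechanisms, and the function name, URL
and credentials are all made up for the example.

```python
import base64
import urllib.request


def authenticated_request(url, user, password):
    """Build a GET request carrying HTTP Basic credentials.

    Assumption: the server uses Basic auth. Plain-HTTP URLs are
    refused so the credentials never travel unencrypted.
    """
    if not url.startswith("https://"):
        raise ValueError("refusing to send credentials over plain HTTP")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(url)
    req.add_header("Authorization", "Basic " + token)
    return req
```

Passing the returned object to urllib.request.urlopen() would then
fetch the file, just as a browser does when a user clicks a search
result.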
htdig isn't perfect. But it's a nice lightweight search engine,
well worth the hassle of installing, and will get you started enough
so that if you want something heavier then you'll have a much
better notion of your requirements.
It took me as long to set up consistent authentication between
Samba, NFS and Apache as to do everything else. Your mileage
may vary depending on what mechanism you use for authentication.
--
Glen Turner