Before the "transition", would it be feasible to scrape/index the current SolidWorks user forum to at least get it into a usable, searchable file that future, more ambitious programmers could mess around with?
I'm just assuming that it will be much easier to do (if it's possible) in its current incarnation than attempting anything like that post-transition to the swamp. Just a thought for those with way more programming experience than me.
I would think you could write a script to systematically navigate through threads, pull the source (HTML?), copy it to a file, and repeat. Granted, the information on its own might not be very usable in that format, but it would at least be available to be transformed in the future.
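For what it's worth, the fetch-save-repeat loop described above could look roughly like the Python sketch below. This is only an illustration under assumptions: it presumes you already have a list of thread URLs (how to enumerate them depends on the forum's URL scheme, which isn't covered here), and the names `fetch_html`, `save_threads`, and the `thread_000000.html` file naming are made up for the example.

```python
import pathlib
import time
import urllib.request


def fetch_html(url):
    """Download one page and return its HTML as text."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def save_threads(urls, out_dir, fetch=fetch_html, delay_s=1.0):
    """Save each URL's HTML to its own file and return the file paths.

    `urls` is a hypothetical, pre-built list of thread URLs.
    `fetch` can be swapped out, e.g. for offline testing.
    """
    out = pathlib.Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for i, url in enumerate(urls):
        target = out / f"thread_{i:06d}.html"
        if not target.exists():  # skip pages already saved, so reruns resume
            target.write_text(fetch(url), encoding="utf-8")
            time.sleep(delay_s)  # pause between requests to go easy on the server
        saved.append(target)
    return saved
```

The raw HTML is kept untouched on purpose, so any parsing or indexing can happen later without hitting the site again.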
Index Scrape/Crawl of Solidworks Forum
Designated Pot-Stirrer
Re: Index Scrape/Crawl of Solidworks Forum
I am hoping the Wayback Machine will take care of that for us.
https://web.archive.org/web/20201202134 ... solidworks
I may not have gone where I intended to go, but I think I have ended up where I needed to be. -Douglas Adams
Re: Index Scrape/Crawl of Solidworks Forum
jmongi wrote: ↑Thu Mar 25, 2021 8:33 am
Before the "transition", would it be feasible to scrape/index the current SolidWorks user forum to at least get it into a usable, searchable file that future, more ambitious programmers could mess around with?
I'm just assuming that it will be much easier to do (if it's possible) in its current incarnation than attempting anything like that post-transition to the swamp. Just a thought for those with way more programming experience than me.
I would think you could write a script to systematically navigate through threads, pull the source (HTML?), copy it to a file, and repeat. Granted, the information on its own might not be very usable in that format, but it would at least be available to be transformed in the future.

There are some software packages that do this. In the first week we were up, we had a minor scandal where someone actually started doing that and then posted it here. The SW Forum has as part of its terms of use that you cannot post the content of the SW Forum publicly in another place. So that pretty much covers that.
But they can't control (or, more importantly, litigate) what happens if you give an account of the same content in your own words. Basically, don't copy/paste anything. You can summarize or elaborate, but please don't copy or scrape and then paste here. I want us to make it on our own merits rather than resort to copying content (and fending off legal jousting).
Blog: http://dezignstuff.com
- jcapriotti
Re: Index Scrape/Crawl of Solidworks Forum
Yeah, I imagine the Dassault legal team is more competent than the 3dswym team.
Jason
Re: Index Scrape/Crawl of Solidworks Forum
I didn't consider the legal aspects of such an activity. Good point.
Designated Pot-Stirrer
Re: Index Scrape/Crawl of Solidworks Forum
@SPerman The Wayback Machine will work great for surface content, but I don't think it will keep the source files for post attachments like macros or models. Will that stuff be completely lost, or is it going to be transferred to the new forum?
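One way to check rather than guess: the Internet Archive has an availability endpoint that reports whether it holds a snapshot of a given URL. Below is a rough Python sketch, assuming the response has the `archived_snapshots` / `closest` shape as I understand it. The helper names are invented for the example, and you would have to supply a real attachment URL yourself.

```python
import json
import urllib.parse
import urllib.request

# Wayback Machine availability endpoint.
API = "https://archive.org/wayback/available"


def availability_url(page_url):
    """Build the availability query URL for one page or attachment URL."""
    return API + "?" + urllib.parse.urlencode({"url": page_url})


def closest_snapshot(response):
    """Return the closest snapshot URL from a parsed response, or None.

    Assumes a response shaped like
    {"archived_snapshots": {"closest": {"available": true, "url": "..."}}},
    with "archived_snapshots" empty when nothing is archived.
    """
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def check(page_url):
    """Ask the archive whether it has a snapshot (needs network access)."""
    with urllib.request.urlopen(availability_url(page_url), timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `check()` on the direct link of a macro or model attachment would show whether that particular file was captured. A `None` result means no snapshot was reported for that exact URL.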