Commit 31cf2a17, authored by 梁业锦

Added many new spiders and reorganized the spider module code

Parent d469bc9d
......@@ -69,14 +69,13 @@
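- Name: hm
- Spider status: **done**
### 8.LiLy
- Homepage: http://www.lily.sh.cn/webapp/wcs/stores/servlet/lilystore
- Name: lily
- Spider status: analysis finished, implementation pending
- The data is embedded in the HTML and is hard to extract, so crawling is postponed
### 9.Eifini
### 9.Eifini(伊芙丽)
- Homepage: https://eifini.tmall.com/
- Name: eifini
- Spider status: no known approach yet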
......@@ -91,11 +90,11 @@
- Data API: http://wap.ur.com.cn/product/product/detail?id=ff8080816dbb693e016dfd58f27c45d9
- Usable, but with known defects:
### 11.Aber Crombie & Fitch
### 11.[Aber Crombie & Fitch](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/AberCrombieFitchSpider.java)
- Homepage: https://www.abercrombie.cn/zh_CN/home
- Name: abercrombie
- Spider status: blocked by an anti-scraping mechanism
- Links are encoded as an anti-scraping measure
- Spider status: **done**
- Anti-scraping via a reverse proxy; circumventing it is deferred
### 12.[Under Armour(安德玛)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/UnderArmourSpider.java)
- Homepage: https://www.underarmour.cn/
......@@ -105,6 +104,7 @@
- Too slow
- Main image link is broken
- Sizes do not map to stock
### 13.Converse(匡威)
- Homepage: https://www.converse.com.cn/
- Name: converse
......@@ -154,10 +154,10 @@
- Data API: https://china.coach.com/rest/default/V1/applet/product/CONF69007_LPK
- Known defect: still needs a check for whether colour or size data exists
### 20.Revolve
### 20.[Revolve](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/RevolveSpider.java)
- Homepage: https://www.revolve.com/wrangler/br/57f1a1/?utm_source=baidu&utm_medium=cpc&utm_campaign=intl_P_cn-d-Wrangler
- Name: reolve
- Spider status:
- Spider status: **done**
### 21.Vans(范斯)
- Homepage: https://vans.com.cn/gallery-index---0---36.html
......@@ -169,20 +169,21 @@
- Name:
- Spider status: store hosted on Tmall
### 23.Oysho
### 23.[Oysho](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/OyshoSpider.java)
- Homepage: https://oysho.tmall.com/ (SPORT WEAR)
- Name:
- Spider status: store hosted on Tmall
- Spider status: **done**
- Optimize handling of the product id in links
### 24.[Stradivarius(斯特拉迪瓦里斯)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/StradivariusSpider.java)
- Homepage: https://www.stradivarius.cn/cn/
- Name: stradivarius
- Spider status: **done**
### 25.Maje
### 25.[Maje](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/MajeSpider.java)
- Homepage: https://www.maje.cn/home?utm_campaign=maje-competitor-u&utm_content=competitor-ph&utm_medium=cpc&utm_source=baidu&utm_term=marc%e6%98%af%e4%bb%80%e4%b9%88%e7%89%8c%e5%ad%90
- Name: maje
- Spider status:
- Spider status: **done**
### 26.[Gucci(古驰)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/GucciSpider.java)
- Homepage: https://www.gucci.cn/zh/?utm_source=baiducpc_cn&utm_medium=cpc&utm_term=Title2&utm_content=URL_E-commerce_HomePage&utm_campaign=BD_PC_URL_E-commerce&src=Baidu&medium=PPC&Network=1&kw=133669705534&ad=31701433238&ag_kwid=23329-4-575c9bdeeef5d28f.9143bf3bdcd3718c
......@@ -199,13 +200,13 @@
- Name: prada
- Spider status: **done**
### 29.Fendi(芬迪)
### 29.[Fendi(芬迪)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/FendiSpider.java)
- Homepage: https://www.fendi.cn/?utm_source=Baidu&utm_medium=PC&utm_campaign=NewBrand%20Pure&utm_content=B_Site
- Name: fendi
- Spider status:
### 30.HuaWei(华为)
- Homepage: https://www.vmall.com/huawei?cid=78140 (huawei)
- Homepage: https://www.vmall.com/huawei?cid=78140
- Name: huawei
- Spider status:
......@@ -214,12 +215,15 @@
- Name: chanel
- Spider status:
### 32.Apple(苹果)
### 32.[Apple(苹果)](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/AppleSpider.java)
- Homepage: https://www.apple.com/cn/shop/buy-iphone/iphone-xr
- Name: apple
- Spider status:
- Spider status: **done**
- Known defects:
- Extremely slow; the code needs refactoring
- Only iPhone and iPad pages can be crawled; to be improved
### 33.Louisvuitton(路易威登LV)
### 33.LouisVuitton(路易威登LV)
- Homepage: https://www.louisvuitton.cn/zhs-cn/homepage?campaign=sem_CN_ZHS_BA_EC_BZON_PC_Valuable_H1_homepage
- Name: louisvuitton
- Spider status:
......@@ -251,8 +255,8 @@
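### 4.Extract and wrap the data
- To create a new spider, see **Conventions for adding a new spider to the project**
- [ZaraSpider.java](../src/main/java/com/diaoyun/zion/chinafrica/bis/impl/ZaraSpider.java)
- For details on how the data is processed, see the spider's @see comments
- For details on how the data is processed, see each spider's @see comments
> The processing logic differs for practically every shopping site's spider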
# Java tools for processing scraped data
## Content-fetching tools
- Fetch the content behind a link as a string
......
......@@ -281,6 +281,35 @@
<artifactId>activation</artifactId>
<version>1.1.1</version>
</dependency>
<!-- dom4j: XML parsing -->
<dependency>
<groupId>dom4j</groupId>
<artifactId>dom4j</artifactId>
<version>1.6.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.json/json -->
<dependency>
<groupId>org.json</groupId>
<artifactId>json</artifactId>
<version>20160810</version>
</dependency>
<!-- webmagic crawler framework -->
<dependency>
<groupId>us.codecraft</groupId>
<artifactId>webmagic-core</artifactId>
<version>0.7.3</version>
</dependency>
<dependency>
<groupId>us.codecraft</groupId>
<artifactId>webmagic-extension</artifactId>
<version>0.7.3</version>
<exclusions>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
</exclusion>
</exclusions>
</dependency>
</dependencies>
......
package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import net.sf.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;
/**
* Data spider for Abercrombie&Fitch
*
* @author 爱酱油不爱醋
*/
@Component("aberCrombieFitchSpider")
public class AberCrombieFitchSpider implements IItemSpider {
// Log under this class's own name rather than PullandbearSpider
private static final Logger logger = LoggerFactory.getLogger(AberCrombieFitchSpider.class);
/**
* Captures an Abercrombie&Fitch product.
* @param targetUrl URL of the product detail page
* @return formatted and translated JSON data
*/
@Override
public JSONObject captureItem(String targetUrl) throws URISyntaxException, IOException, ExecutionException, InterruptedException, TimeoutException {
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.ABERCROMBIEFITCH.getValue());
ProductResponse productResponse = formatProductResponse(content);
JSONObject resultJson = JSONObject.fromObject(productResponse);
TranslateHelper.translateProductResponse(resultJson);
return resultJson;
}
/**
* Formats the response data.
* @param content main page content
* @return the formatted data
*/
private ProductResponse formatProductResponse(String content) {
// Response wrapper
ProductResponse productResponse = new ProductResponse();
// The product has properties, so set the flag to true
productResponse.setPropFlag(true);
// No stock information is available
productResponse.setStockFlag(false);
// Stock info: with no usable stock data, default to 9999 overall (999 per SKU below)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original and promotional prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the usual ones are colour and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
// Basic product info
ItemInfo itemInfo = new ItemInfo();
// Parse into a Document
Document document = Jsoup.parse(content);
String pId = document.select("div[class=find-in-store display-none]").attr("pid");
//////////////////////////////////// Basic product info ////////////////////////////
itemInfo.setShopName("Abercrombie&Fitch");
itemInfo.setShopUrl("https://www.abercrombie.cn/");
itemInfo.setItemId(pId);
itemInfo.setTitle(document.select("meta[property=og:title]").attr("content"));
itemInfo.setPic(document.select("input[id=shoppingcartpic]").attr("value"));
//////////////////////////////////// Basic product info END /////////////////////////
String fullPrice = document.select("meta[property=og:price:amount]").attr("content");
fullPrice = exchangeRate(fullPrice);
Elements colorsEle = document.select("ul[class=swatches color]").select("li");
Elements sizesEle = document.select("ul[class=product__sizes]")
.select("ul[class=swatches option va_1mprmry]").select("li");
for (int i = 0; i < colorsEle.size(); i++) {
//////////////////////////////////// Colour property ////////////////////////////
String colorNo = i + "";
String color = colorsEle.get(i).attr("title");
// Read the image from the i-th swatch; calling select on the whole list always yields the first colour's image
String dataLgimg = colorsEle.get(i).select("span").attr("data-lgimg");
JSONObject datLgimgObj = JSONObject.fromObject(dataLgimg);
String imgUrl = datLgimgObj.getString("url");
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(colorNo);
productPropColor.setPropName(color);
productPropColor.setImage(imgUrl);
propSet.add(productPropColor);
// The key "颜色" means "colour"
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
//////////////////////////////////// Colour property END ////////////////////////////////////////////
for (Element sizeEle : sizesEle) {
///////////////////////// Size property ////////////////////
String sizeNo = sizeEle.attr("title");
String size = sizeNo;
ProductProp productPropSize = new ProductProp();
productPropSize.setPropId(sizeNo);
productPropSize.setPropName(size);
sizePropSet.add(productPropSize);
// The key "尺码" means "size"
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size property END ////////////////////
// SKU id used for stock lookup (colour id + size id)
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
//////////////////////////////////// Stock ////////////////////////////////////////////
// Create the per-SKU stock list on first use
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSkuStr(skuStr);
productSkuStock.setSellableQuantity(999);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END /////////////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(fullPrice);
originalPriceList.add(originalPrice);
productResponse.setPrice(fullPrice);
productResponse.setSalePrice(fullPrice + "-" + fullPrice);
//////////////////////////////////// Original price END ///////////////////////////////
}
}
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("Abercrombie&Fitch");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
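The nested colour and size loops above amount to a cross product whose entries are keyed by a `;colourId;sizeId;` string. The following is a minimal JDK-only sketch of that idea, not the project's code: the class `SkuGrid` and its methods are hypothetical, and plain strings stand in for `ProductProp`. It also shows `Map.computeIfAbsent` as one possible replacement for the "get, check for null, merge, put" sequence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the project.
final class SkuGrid {

    // Builds a key in the ";colourId;sizeId;" layout used by the spider above.
    static String skuStr(String colourId, String sizeId) {
        return ";" + colourId + ";" + sizeId + ";";
    }

    // Cross product of colours and sizes, in iteration order.
    static List<String> allSkus(List<String> colourIds, List<String> sizeIds) {
        List<String> skus = new ArrayList<>();
        for (String colour : colourIds) {
            for (String size : sizeIds) {
                skus.add(skuStr(colour, size));
            }
        }
        return skus;
    }

    // Adds a value under a property name, creating the set on first use.
    static void addProp(Map<String, Set<String>> props, String name, String value) {
        props.computeIfAbsent(name, k -> new LinkedHashSet<>()).add(value);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> props = new LinkedHashMap<>();
        addProp(props, "colour", "0");
        addProp(props, "size", "S");
        addProp(props, "size", "M");
        addProp(props, "size", "S"); // the Set ignores the duplicate
        System.out.println(props);
        System.out.println(allSkus(Arrays.asList("0"), Arrays.asList("S", "M")));
    }
}
```

Collecting the property sets first and generating the SKU keys afterwards keeps each step independent of the loop nesting.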
......@@ -2,20 +2,24 @@ package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.ProductResponse;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.spider.AdidasSpiderParse;
import net.sf.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;
/**
* Data spider for Adidas
*
......@@ -23,17 +27,10 @@ import java.util.concurrent.TimeoutException;
*/
@Component("adidasSpider")
public class AdidasSpider implements IItemSpider {
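// Log under this class's own name rather than PullandbearSpider
private static final Logger logger = LoggerFactory.getLogger(AdidasSpider.class);
/**
* URL of the Adidas product detail page
*/
private static final String ADIDAS_URL="https://www.adidas.com.cn/item";
/**
* Captures an Adidas product.
* @see AdidasSpiderParse#formatProductResponse the formatting method
* @param targetUrl URL of the product detail page
* @return formatted and translated JSON data
*/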
......@@ -46,12 +43,138 @@ public class AdidasSpider implements IItemSpider {
// Data API for this product
targetUrl = "https://www.adidas.com.cn/item/othercolor?itemCode=" + pId;
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.ADIDAS.getValue());
// Format the product data
ProductResponse productResponse = AdidasSpiderParse.formatProductResponse(content, pId);
ProductResponse productResponse = formatProductResponse(content, pId);
JSONObject resultJson = JSONObject.fromObject(productResponse);
// Translate
TranslateHelper.translateProductResponse(resultJson);
return resultJson;
}
/**
* Formats the response data.
* @param content main page content
* @param pId product id, the value passed as itemCode
* @return the formatted data
*/
private ProductResponse formatProductResponse(String content, String pId) throws IOException, URISyntaxException {
// Response wrapper
ProductResponse productResponse = new ProductResponse();
// The product has properties, so set the flag to true
productResponse.setPropFlag(true);
// No stock information is available
productResponse.setStockFlag(false);
// Stock info: with no usable stock data, default to 9999 overall (999 per SKU below)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original and promotional prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the usual ones are colour and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
// Basic product info
ItemInfo itemInfo = new ItemInfo();
// Parse into a Document
Document document = Jsoup.parse(content);
// Colours (model codes)
List<String> colorNoList = document.select("div[class=pdp-color events-color-close]")
.select("ul").select("li").eachAttr("code");
// Price
String fullPrice = document.select("input[id=salePrice]").attr("value");
String originalFullPrice = exchangeRate(fullPrice);
// Sizes
List<String> pSizeList = document.select("div[class=overview product-size]")
.select("ul").select("li").eachText();
//////////////////////////////////// Basic product info ////////////////////////////
itemInfo.setShopName("Adidas");
itemInfo.setShopUrl("https://www.adidas.com");
itemInfo.setItemId(pId);
itemInfo.setTitle(document.select("input[id=itemTitle]").attr("value"));
itemInfo.setPic(document.select("input[id=shoppingcartpic]").attr("value"));
//////////////////////////////////// Basic product info END /////////////////////////
//////////////////////////////////// Colour property ////////////////////////////
for (String colorNo : colorNoList) {
// NOTE: the URL is built from pId, not colorNo, so every iteration requests the same page
String targetUrl = "https://www.adidas.com.cn/item/othercolor?itemCode=" + pId;
String colorContent = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.ADIDAS.getValue());
Document colorDoc = Jsoup.parse(colorContent);
String color = colorDoc.select("input[id=colorDisPaly]").attr("value");
String imgUrl = colorDoc.select("input[id=shoppingcartpic]").attr("value");
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(colorNo);
productPropColor.setPropName(color);
productPropColor.setImage(imgUrl);
propSet.add(productPropColor);
// The key "颜色" means "colour"
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
//////////////////////////////////// Colour property END ////////////////////////////////////////////
for (int i = 0; i < pSizeList.size(); i++) {
///////////////////////// Size property ////////////////////
String sizeNo = pSizeList.get(i);
String size = sizeNo;
ProductProp productPropSize = new ProductProp();
productPropSize.setPropId(sizeNo);
productPropSize.setPropName(size);
sizePropSet.add(productPropSize);
// The key "尺码" means "size"
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size property END ////////////////////
// SKU id used for stock lookup (colour id + size id)
String skuStr = ";" + colorNo + ";" + pSizeList.get(i) + ";";
//////////////////////////////////// Stock ////////////////////////////////////////////
// Create the per-SKU stock list on first use
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSkuStr(skuStr);
productSkuStock.setSellableQuantity(999);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END /////////////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(originalFullPrice);
originalPriceList.add(originalPrice);
productResponse.setPrice(originalFullPrice);
productResponse.setSalePrice(originalFullPrice + "-" + originalFullPrice);
//////////////////////////////////// Original price END ///////////////////////////////
}
}
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("Adidas");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
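Each pass of the colour loop above issues one HTTP request. When several iterations ask for the same URL, a small per-call cache avoids repeating the request. The sketch below rests on that assumption: `FetchCache` is a hypothetical class, and the `Function` passed to it merely stands in for the project's `HttpClientUtil.getContentByUrl`, whose behaviour is not shown in this diff.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper for illustration only; not part of the project.
// Not thread-safe: intended to live for the duration of one captureItem call.
final class FetchCache {
    private final Map<String, String> bodies = new HashMap<>();
    private final Function<String, String> fetcher;
    private int misses;

    // fetcher is a stand-in for the real HTTP call.
    FetchCache(Function<String, String> fetcher) {
        this.fetcher = fetcher;
    }

    // Returns the cached body for the URL, fetching it only on the first request.
    String get(String url) {
        String body = bodies.get(url);
        if (body == null) {
            body = fetcher.apply(url);
            bodies.put(url, body);
            misses++;
        }
        return body;
    }

    // Number of times the fetcher was actually invoked.
    int misses() {
        return misses;
    }

    public static void main(String[] args) {
        FetchCache cache = new FetchCache(url -> "body of " + url);
        cache.get("https://example.invalid/item?itemCode=A");
        cache.get("https://example.invalid/item?itemCode=A");
        cache.get("https://example.invalid/item?itemCode=B");
        System.out.println("fetches performed: " + cache.misses());
    }
}
```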
package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.constant.KeyConstant;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.ProductResponse;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.JsoupUtil;
import com.diaoyun.zion.master.util.spider.SpiderUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.math.BigDecimal;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
/**
* Data spider for afri-eshop
*
* @author G
*/
@Component("africaShopItemSpider")
public class AfricaShopItemSpider implements IItemSpider {
......@@ -32,12 +37,111 @@ public class AfricaShopItemSpider implements IItemSpider {
// Get the product info; the details sit in a <script> tag: <script type="application/json" id="ProductJson-product-template">
resultObj = JsoupUtil.getScriptContentById(content, "ProductJson-product-template");
// Format into the response wrapper
ProductResponse productResponse = SpiderUtil.formatAfricaShopProductResponse(resultObj);
ProductResponse productResponse = formatProductResponse(resultObj);
resultObj = JSONObject.fromObject(productResponse);
// Translate
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the afri-eshop response data.
*
* @param resultObj product JSON taken from the page's script tag
* @return the formatted data
*/
private ProductResponse formatProductResponse(JSONObject resultObj) {
ProductResponse productResponse = new ProductResponse();
// Original prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
// Promotional prices
List<ProductPromotion> promotionList = new ArrayList<>();
// Stock
DynStock dynStock = new DynStock();
// The data contains no exact stock count, so default to an ample quantity
dynStock.setSellableQuantity(9999);
// nike: the properties are mostly colour and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>();
// Basic product info
ItemInfo itemInfo = new ItemInfo();
JSONArray variantsArray = resultObj.getJSONArray("variants");
// Properties
JSONArray optionsArray = resultObj.getJSONArray("options");
for (int i = 0; i < variantsArray.size(); i++) {
// Properties of this variant
JSONArray itemOptionsArray = variantsArray.getJSONObject(i).getJSONArray("options");
// A product without properties returns "Default Title"
if ("Default Title".equalsIgnoreCase(itemOptionsArray.getString(0))) {
break;
}
String skuStr = ";";
for (int m = 0; m < itemOptionsArray.size(); m++) {
skuStr = skuStr + KeyConstant.CUSTOMIZE_ID + itemOptionsArray.getString(m) + ";";
}
/////////////////// Original price ////////////////////////////////////
OriginalPrice originalPrice = new OriginalPrice();
String price = variantsArray.getJSONObject(i).getString("price");
BigDecimal priceOld = new BigDecimal(price);
BigDecimal div = new BigDecimal("100");
// RoundingMode.DOWN replaces the deprecated BigDecimal.ROUND_DOWN constant
BigDecimal priceNew = priceOld.divide(div, 2, java.math.RoundingMode.DOWN);
originalPrice.setPrice(priceNew.toString());
originalPrice.setSkuStr(skuStr);
originalPriceList.add(originalPrice);
/////////////////// Original price END ////////////////////////////////
//////////////////////////////////// Stock ////////////////////////////////////////////
productResponse.setStockFlag(true);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSellableQuantity(999);
productSkuStock.setSkuStr(skuStr);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END ////////////////////////////////////////////
// Collect all properties
for (int j = 0; j < optionsArray.size(); j++) {
//////////////////////////////////// Product properties ////////////////////////////////////////////
// Product property
Set<ProductProp> propSet = new HashSet<>();
ProductProp productProp = new ProductProp();
productProp.setPropId(KeyConstant.CUSTOMIZE_ID + itemOptionsArray.getString(j));
productProp.setPropName(itemOptionsArray.getString(j));
propSet.add(productProp);
if (productPropSet.get(optionsArray.getString(j)) == null) {
productPropSet.put(optionsArray.getString(j), propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get(optionsArray.getString(j));
propSet.addAll(oldPropSet);
productPropSet.put(optionsArray.getString(j), propSet);
}
//////////////////////////////////// Product properties END ////////////////////////////////////////////
}
}
itemInfo.setItemId(resultObj.getString("id"));
// Use the first image
itemInfo.setPic(resultObj.getString("featured_image"));
itemInfo.setShopName(PlatformEnum.AfriEshop.getValue());
itemInfo.setShopUrl("https://www.afri-eshop.com/");
itemInfo.setTitle(resultObj.getString("title"));
productResponse.setPropFlag(true);
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform(PlatformEnum.AfriEshop.getValue());
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
String price = resultObj.getString("price");
BigDecimal priceOld = new BigDecimal(price);
BigDecimal div = new BigDecimal("100");
// RoundingMode.DOWN replaces the deprecated BigDecimal.ROUND_DOWN constant
BigDecimal priceNew = priceOld.divide(div, 2, java.math.RoundingMode.DOWN);
productResponse.setPrice(priceNew.toString());
return productResponse;
}
}
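The price handling above divides the string amount by 100 and truncates to two decimals, which suggests the feed reports prices in minor units (for example cents); that reading is an assumption, since the diff does not say so. A small JDK-only sketch of the conversion follows. The helper `MinorUnits.toMajor` is hypothetical and uses `RoundingMode.DOWN` rather than the deprecated `BigDecimal.ROUND_DOWN` constant.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper for illustration only; not part of the project.
final class MinorUnits {
    private static final BigDecimal HUNDRED = new BigDecimal("100");

    // Converts an amount in minor units to major units with two decimals,
    // truncating (not rounding) any further digits.
    static String toMajor(String minorUnits) {
        return new BigDecimal(minorUnits)
                .divide(HUNDRED, 2, RoundingMode.DOWN)
                .toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(toMajor("12999"));
        System.out.println(toMajor("500"));
        System.out.println(toMajor("12345.6"));
    }
}
```

Keeping the amount in `BigDecimal` end to end avoids the binary rounding errors that `double` arithmetic would introduce for currency values.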
package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.SpiderUtil;
import net.sf.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;
/**
* Data spider for Apple(苹果)
*
* @author 爱酱油不爱醋
*/
@Component("appleSpider")
public class AppleSpider implements IItemSpider {
// Log under this class's own name rather than PullandbearSpider
private static final Logger logger = LoggerFactory.getLogger(AppleSpider.class);
/**
* Captures an Apple product.
* @param targetUrl URL of the product detail page
* @return formatted and translated JSON data
*/
@Override
public JSONObject captureItem(String targetUrl) throws URISyntaxException, IOException, ExecutionException, InterruptedException, TimeoutException {
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.APPLE.getValue());
ProductResponse productResponse = formatProductResponse(content);
JSONObject resultObj = JSONObject.fromObject(productResponse);
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the response data.
*
* @param content main page content
* @return the formatted data
*/
private ProductResponse formatProductResponse(String content) throws IOException, URISyntaxException {
// Response wrapper
ProductResponse productResponse = new ProductResponse();
// The product has properties, so set the flag to true
productResponse.setPropFlag(true);
// Stock info: with no usable stock data, default to 9999 overall (999 per SKU below)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original and promotional prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the usual ones are colour and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic product info
ItemInfo itemInfo = new ItemInfo();
// Parse into a Document
Document document = Jsoup.parse(content);
String pTitle = document.select("meta[property=og:title]").attr("content");
String imgUrl = document.select("meta[property=og:image]").attr("content");
String shopUrl = "https://www.apple.com/";
//////////////////////////////////// Basic product info ////////////////////////////
itemInfo.setShopName("Apple");
itemInfo.setShopUrl(shopUrl);
itemInfo.setTitle(pTitle);
itemInfo.setPic(imgUrl);
//////////////////////////////////// Basic product info END /////////////////////////
// Collect the links to every model of this product
Elements pUrlEle = document.select("div[class=selection-buttons]").select("div[class=item equalize-capacity-button-height ]");
for (Element element : pUrlEle) {
String targetUrl = element.select("a").attr("href");
if (!targetUrl.contains("https://www.apple.com")) {
targetUrl = shopUrl + targetUrl;
}
logger.debug("Apple model URL: {}", targetUrl);
String[] spilt = targetUrl.split("/");
String pId = spilt[spilt.length - 2] + "/" + spilt[spilt.length - 1];
content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.APPLE.getValue());
document = Jsoup.parse(content);
String skuStr = ";" + pId + ";" + pId + ";";
//////////////////////////////////// Model ////////////////////////////////////////////
pTitle = document.select("meta[property=og:title]").attr("content");
imgUrl = document.select("meta[property=og:image]").attr("content");
ProductProp productPropModel = new ProductProp();
productPropModel.setPropId(pId);
productPropModel.setPropName(pTitle);
productPropModel.setImage(imgUrl);
propSet.add(productPropModel);
// The key "款式" means "model"
if (productPropSet.get("款式") == null) {
productPropSet.put("款式", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("款式");
propSet.addAll(oldPropSet);
productPropSet.put("款式", propSet);
}
//////////////////////////////////// Model END ////////////////////////////////////////////
//////////////////////////////////// Stock ////////////////////////////////////////////
// Create the per-SKU stock list on first use
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSellableQuantity(999);
productSkuStock.setSkuStr(skuStr);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END /////////////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
// Read the product's original price
String fullPrice = document.select("span[class=as-price-currentprice]").text();
fullPrice = SpiderUtil.retainNumber(fullPrice);
fullPrice = exchangeRate(fullPrice);
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(fullPrice);
originalPriceList.add(originalPrice);
productResponse.setPrice(fullPrice);
productResponse.setSalePrice(fullPrice + "-" + fullPrice);
//////////////////////////////////// Original price END ///////////////////////////////
}
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("Apple");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
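The model loop above joins `shopUrl` and the `href` by string concatenation and takes the last two `/`-separated pieces of the URL as the product id. The sketch below shows the same two steps with `java.net.URI`, which resolves a link with a leading slash without producing a double slash and keeps any query string out of the id. The class `VariantLinks` is hypothetical, and whether the real `href` values are relative or carry a query string is not known from this diff.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the project.
final class VariantLinks {

    // Resolves a possibly relative href against the shop's base URL.
    // An href that is already absolute is returned unchanged.
    static String absolute(String base, String href) {
        return URI.create(base).resolve(href).toString();
    }

    // Returns the last two non-empty path segments joined by "/",
    // ignoring the query string and any trailing slash.
    static String lastTwoSegments(String url) {
        List<String> segments = new ArrayList<>();
        for (String part : URI.create(url).getPath().split("/")) {
            if (!part.isEmpty()) {
                segments.add(part);
            }
        }
        if (segments.size() < 2) {
            throw new IllegalArgumentException("need at least two path segments: " + url);
        }
        int n = segments.size();
        return segments.get(n - 2) + "/" + segments.get(n - 1);
    }

    public static void main(String[] args) {
        String url = absolute("https://www.apple.com/", "/cn/shop/buy-iphone/iphone-xr");
        System.out.println(url);
        System.out.println(lastTwoSegments(url));
    }
}
```

`URI.create` throws `IllegalArgumentException` for strings that are not valid URIs (for instance ones containing spaces), so real links may need encoding first.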
package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.SpiderUtil;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;
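/**
* Data spider for Burberry(博柏利)
* TODO: requires a simulated login; deferred
* @author 爱酱油不爱醋
*/
public class BurberrySpider implements IItemSpider {
// Log under this class's own name rather than ZaraSpider
private static final Logger logger = LoggerFactory.getLogger(BurberrySpider.class);
/**
* Captures a Burberry product.
* @param targetUrl URL of the product detail page
* @return formatted and translated JSON data
*/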
@Override
public JSONObject captureItem(String targetUrl) throws URISyntaxException, IOException, ExecutionException, InterruptedException, TimeoutException {
String[] spilt = targetUrl.split("/");
spilt = spilt[5].split("[?]");
String pId = spilt[0];
String styleId = pId.substring(0, 5);
// NOTE: the endpoint and the platform enum on the next line still refer to Gucci, not Burberry
targetUrl = "https://www.gucci.cn/zh/pr/sameStyleBuriedPoint?itemCode=" + pId +"&style=" + styleId + "&categoryPath=&_=1572859976423";
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.GUCCI.getValue());
ProductResponse productResponse = formatProductResponse(content, pId);
JSONObject resultObj = JSONObject.fromObject(productResponse);
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the response data.
* @param content JSON body returned by the data API
* @param pId product id
* @return the formatted data
*/
private ProductResponse formatProductResponse(String content, String pId) {
// Response wrapper
ProductResponse productResponse = new ProductResponse();
// The product has properties, so set the flag to true
productResponse.setPropFlag(true);
// Stock info: with no usable stock data, default to 9999 overall (999 per SKU below)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original and promotional prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the usual ones are colour and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic product info
ItemInfo itemInfo = new ItemInfo();
JSONObject dataMap = JSONObject.fromObject(content);
// Take the "data" node
JSONObject dataObj = dataMap.getJSONObject("data");
// An empty node means the URL was not a product detail page; return the empty response
if (dataObj.size() == 0) {
return productResponse;
}
//////////////////////////////////// Basic product info ////////////////////////////
// NOTE: the shop name and URL below still refer to Coach, not Burberry
itemInfo.setShopName(PlatformEnum.COACH.getLabel());
itemInfo.setShopUrl("https://china.coach.com");
itemInfo.setItemId(pId);
itemInfo.setTitle(dataObj.getString("name"));
//////////////////////////////////// Basic product info (the image is set below) END /////////////////////////
List<String> sizeNoList = new ArrayList<>();
List<String> colorNoList = new ArrayList<>();
// 取 attributes 节点数组
JSONArray attributesArr = dataObj.getJSONArray("attributes");
for (int i = 0; i < attributesArr.size(); i++) {
///////////////////////// 获取商品颜色属性 ////////////////////////////////////////////////////////////////
// 0 位为颜色属性
if (i == 0) {
// 取 values 节点数组
JSONArray valuesArr = attributesArr.getJSONObject(i).getJSONArray("values");
for (int j = 0; j < valuesArr.size(); j++) {
JSONObject valuesObj = valuesArr.getJSONObject(j);
// 获取图片路径
String imageUrl = valuesObj.getString("image");
// 设置商品基本信息的图片
if (i == 0) {
itemInfo.setPic(imageUrl);
}
ProductProp productPropColor = new ProductProp();
String colorNo = valuesObj.getString("value_index");
colorNoList.add(colorNo);
productPropColor.setPropId(colorNo);
productPropColor.setPropName(valuesObj.getString("label"));
productPropColor.setImage(imageUrl);
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
}
///////////////////////// Color property END ////////////////////////////////////////////////////////////////
// Index 1 holds the size attribute (some products, such as handbags, have none)
} else if (i == 1) {
// Read the "values" array
JSONArray valuesArr = attributesArr.getJSONObject(i).getJSONArray("values");
///////////////////////// Size property ////////////////////////////////////////////////////////////////
for (int j = 0; j < valuesArr.size(); j++) {
JSONObject valuesObj = valuesArr.getJSONObject(j);
ProductProp productPropSize = new ProductProp();
String sizeNo = valuesObj.getString("value_index");
sizeNoList.add(sizeNo);
productPropSize.setPropId(sizeNo);
productPropSize.setPropName(valuesObj.getString("label"));
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size property END /////////////////////////////////////////////////////
}
}
}
for (String colorNo : colorNoList) {
for (String sizeNo : sizeNoList) {
// Build the SKU key
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
// Lazily create the SKU stock list
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
// No real stock data is available, so use the placeholder quantity
productSkuStock.setSellableQuantity(999);
// SKU key this stock entry belongs to
productSkuStock.setSkuStr(skuStr);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END /////////////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
// Original price of the product (a promotional price may also exist)
OriginalPrice originalPrice = new OriginalPrice();
String fullPrice = dataObj.getString("price");
fullPrice = SpiderUtil.retainNumber(fullPrice);
// TODO: convert the exchange rate; the price is currently in RMB
fullPrice = exchangeRate(fullPrice);
productResponse.setPrice(fullPrice);
productResponse.setSalePrice(fullPrice + "-" + fullPrice);
originalPrice.setPrice(fullPrice);
originalPrice.setSkuStr(skuStr);
originalPriceList.add(originalPrice);
//////////////////////////////////// Original price END //////////////////////////////////
}
}
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform(PlatformEnum.COACH.getValue());
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
......@@ -2,10 +2,11 @@ package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.ProductResponse;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.spider.CoachSpiderParse;
import com.diaoyun.zion.master.util.SpiderUtil;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
......@@ -13,9 +14,12 @@ import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;
/**
* COACH(蔻驰)
*
......@@ -27,7 +31,6 @@ public class CoachSpider implements IItemSpider {
/**
* Coach item spider
* @see CoachSpiderParse#formatProductResponse formatting method
* @param targetUrl URL of the product detail page
* @return formatted and translated JSON data
*/
......@@ -39,9 +42,164 @@ public class CoachSpider implements IItemSpider {
targetUrl = "https://" + urlSpilt[2] + "/rest/default/V1/applet/product/CONF" + pId;
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.COACH.getValue());
JSONObject resultObj = JSONObject.fromObject(content);
ProductResponse productResponse = CoachSpiderParse.formatProductResponse(resultObj, pId);
ProductResponse productResponse = formatProductResponse(resultObj, pId);
if (productResponse.getItemInfo() == null) {
resultObj.put("message", "找不到此类网址的数据爬虫!");
return resultObj;
}
resultObj = JSONObject.fromObject(productResponse);
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the response data
* @param dataMap the main JSON data
* @param pId the product id
* @return the formatted data
*/
private ProductResponse formatProductResponse(JSONObject dataMap, String pId) {
// Create the response wrapper
ProductResponse productResponse = new ProductResponse();
// The product has selectable properties, so set the flag to true
productResponse.setPropFlag(true);
// Stock info: no real stock data is available, so the total defaults to 9999 and each SKU to 999
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original prices and promotional prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the common ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic item info
ItemInfo itemInfo = new ItemInfo();
// Read the "data" node
JSONObject dataObj = dataMap.getJSONObject("data");
// An empty "data" node means the URL is not a product detail page; return the empty response
if (dataObj.size() == 0) {
return productResponse;
}
//////////////////////////////////// Basic item info ////////////////////////////
itemInfo.setShopName("Coach");
itemInfo.setShopUrl("https://china.coach.com");
itemInfo.setItemId(pId);
itemInfo.setTitle(dataObj.getString("name"));
//////////////////////////////////// Basic item info END (the picture is set below) /////////////////////////
List<String> sizeNoList = new ArrayList<>();
List<String> colorNoList = new ArrayList<>();
// Read the "attributes" array
JSONArray attributesArr = dataObj.getJSONArray("attributes");
for (int i = 0; i < attributesArr.size(); i++) {
///////////////////////// Color property ////////////////////////////////////////////////////////////////
// Index 0 holds the color attribute
if (i == 0) {
JSONArray valuesArr = attributesArr.getJSONObject(i).getJSONArray("values");
for (int j = 0; j < valuesArr.size(); j++) {
JSONObject valuesObj = valuesArr.getJSONObject(j);
String colorNo = valuesObj.getString("value_index");
String color = valuesObj.getString("label");
String imageUrl = valuesObj.getString("image");
// Use the first color's image as the item's main picture
if (j == 0) {
itemInfo.setPic(imageUrl);
}
colorNoList.add(colorNo);
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(colorNo);
productPropColor.setPropName(color);
productPropColor.setImage(imageUrl);
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
}
///////////////////////// Color property END ////////////////////////////////////////////////////////////////
// Index 1 holds the size attribute (some products, such as handbags, have none)
} else if (i == 1) {
// Read the "values" array
JSONArray valuesArr = attributesArr.getJSONObject(i).getJSONArray("values");
///////////////////////// Size property ////////////////////////////////////////////////////////////////
for (int j = 0; j < valuesArr.size(); j++) {
JSONObject valuesObj = valuesArr.getJSONObject(j);
String sizeNo = valuesObj.getString("value_index");
String size = valuesObj.getString("label");
sizeNoList.add(sizeNo);
ProductProp productPropSize = new ProductProp();
productPropSize.setPropId(sizeNo);
productPropSize.setPropName(size);
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size property END /////////////////////////////////////////////////////
}
}
}
for (String colorNo : colorNoList) {
for (String sizeNo : sizeNoList) {
// Build the SKU key
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSkuStr(skuStr);
productSkuStock.setSellableQuantity(999);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END /////////////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
// Original price of the product (a promotional price may also exist)
String fullPrice = dataObj.getString("price");
fullPrice = SpiderUtil.retainNumber(fullPrice);
// TODO: convert the exchange rate; the price is currently in RMB
fullPrice = exchangeRate(fullPrice);
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setPrice(fullPrice);
originalPrice.setSkuStr(skuStr);
originalPriceList.add(originalPrice);
productResponse.setPrice(fullPrice);
productResponse.setSalePrice(fullPrice + "-" + fullPrice);
//////////////////////////////////// Original price END //////////////////////////////////
}
}
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("Coach");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
\ No newline at end of file
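Review note: the nested color/size loops in `formatProductResponse` above add no stock or price entries when a product has no size attribute, which the code's own comment says can happen for handbags. Below is a minimal sketch of one way to build the SKU keys so that color-only products still get entries. `SkuKeys` and `build` are hypothetical names that are not part of this commit, and the `;color;` fallback format is an assumption modeled on how the Gap spider concatenates one segment per attribute.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of this commit. */
final class SkuKeys {
    private SkuKeys() {
    }

    /**
     * Builds ";color;size;" keys for every color/size pair.
     * Assumption: a product without sizes gets one ";color;" key per color.
     */
    static List<String> build(List<String> colorNos, List<String> sizeNos) {
        List<String> keys = new ArrayList<>();
        for (String colorNo : colorNos) {
            if (sizeNos.isEmpty()) {
                // No size attribute: fall back to a color-only key
                keys.add(";" + colorNo + ";");
                continue;
            }
            for (String sizeNo : sizeNos) {
                keys.add(";" + colorNo + ";" + sizeNo + ";");
            }
        }
        return keys;
    }
}
```

With a helper of this shape, the stock and price lists could be filled from a single loop over the returned keys, and the price lookup, which does not depend on the SKU, could move outside the loop.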
package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import net.sf.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
/**
* Eifini(伊芙丽) item spider
* TODO: not implemented yet
* @author 爱酱油不爱醋
*/
@Component("eifiniSpider")
public class EifiniSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(EifiniSpider.class);
/**
* Eifini(伊芙丽) item spider
* @param targetUrl URL of the product detail page
* @return formatted and translated JSON data
*/
@Override
public JSONObject captureItem(String targetUrl) throws InterruptedException, IOException, ExecutionException, URISyntaxException, TimeoutException {
return null;
}
}
......@@ -4,14 +4,14 @@ import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import net.sf.json.JSONObject;
import org.springframework.stereotype.Component;
import java.util.HashMap;
import java.util.Map;
/**
* Empty (no-op) item spider
*
* @author G
*/
@Component("emptyItemSpider")
public class EmptyItemSpider implements IItemSpider {
@Override
public JSONObject captureItem(String targetUrl) {
JSONObject resultMap=new JSONObject();
......
......@@ -2,11 +2,11 @@ package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.ProductResponse;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.JsoupUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.spider.EspritSpiderParse;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
......@@ -14,9 +14,12 @@ import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;
/**
* Esprit(思捷) item spider
*
......@@ -26,14 +29,9 @@ import java.util.concurrent.TimeoutException;
public class EspritSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(EspritSpider.class);
/**
* Esprit(思捷) product detail page URL
*/
private static final String ESPRIT_URL = "https://www.esprit.cn/product/";
/**
* Esprit(思捷) item spider
* @see EspritSpiderParse#formatProductResponse formatting method
*
* @param targetUrl URL of the product detail page
* @return formatted and translated JSON data
*/
......@@ -41,10 +39,137 @@ public class EspritSpider implements IItemSpider {
public JSONObject captureItem(String targetUrl) throws InterruptedException, IOException, ExecutionException, URISyntaxException, TimeoutException {
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.ESPRIT.getValue());
JSONObject dataMap = JsoupUtil.getItemDetailByName(content, "window.__INITIAL_STATE__");
ProductResponse productResponse = EspritSpiderParse.formatProductResponse(dataMap);
ProductResponse productResponse = formatProductResponse(dataMap);
JSONObject resultObj = JSONObject.fromObject(productResponse);
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the response data
*
* @param dataMap the main JSON data
* @return the formatted data
*/
private ProductResponse formatProductResponse(JSONObject dataMap) {
// Create the response wrapper
ProductResponse productResponse = new ProductResponse();
// The product has selectable properties, so set the flag to true
productResponse.setPropFlag(true);
// Stock info: no real stock data is available, so the total defaults to 9999 and each SKU to 999
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original prices and promotional prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the common ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic item info
ItemInfo itemInfo = new ItemInfo();
// Read the "details" node under "product"
JSONObject detailsObj = dataMap.getJSONObject("product").getJSONObject("details");
//////////////////////////////////// Basic item info ////////////////////////////
itemInfo.setShopName("Esprit");
itemInfo.setShopUrl("https://www.esprit.cn");
itemInfo.setItemId(detailsObj.getString("code"));
itemInfo.setTitle(detailsObj.getString("title"));
//////////////////////////////////// Basic item info END (the picture is set below) /////////////////////////
// Original price of the product
String fullPrice = detailsObj.getJSONObject("salePrice").getString("amount");
// TODO: convert the exchange rate; the price is currently in RMB
fullPrice = exchangeRate(fullPrice);
JSONArray values_0_Arr = detailsObj.getJSONArray("options").getJSONObject(0).getJSONArray("values");
JSONArray values_1_Arr = detailsObj.getJSONArray("options").getJSONObject(1).getJSONArray("values");
for (int i = 0; i < values_0_Arr.size(); i++) {
JSONObject values_0_Obj = values_0_Arr.getJSONObject(i);
//////////////////////////////////// Color property ////////////////////////////////////////////
String colorNo = values_0_Obj.getString("code");
String color = values_0_Obj.getString("displayName");
String imageUrl = values_0_Obj.getJSONArray("images").getJSONObject(0).getString("url");
if (i == 0) {
itemInfo.setPic(imageUrl);
}
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(colorNo);
productPropColor.setPropName(color);
productPropColor.setImage(imageUrl);
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
//////////////////////////////////// Color property END ////////////////////////////////////////////
///////////////////////// Size property //////////////////////////////////////////////////////////
for (int j = 0; j < values_1_Arr.size(); j++) {
JSONObject values_1_Obj = values_1_Arr.getJSONObject(j);
String sizeNo = values_1_Obj.getString("code");
String size = values_1_Obj.getString("displayName");
ProductProp productPropSize = new ProductProp();
productPropSize.setPropId(sizeNo);
productPropSize.setPropName(size);
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size property END /////////////////////////////////////////////////////
// Build the SKU key
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
//////////////////////////////////// Stock ////////////////////////////////////////////
// Lazily create the SKU stock list
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSkuStr(skuStr);
productSkuStock.setSellableQuantity(999);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END /////////////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(fullPrice);
originalPriceList.add(originalPrice);
productResponse.setPrice(fullPrice);
productResponse.setSalePrice(fullPrice + "-" + fullPrice);
//////////////////////////////////// Original price END //////////////////////////////////
}
}
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("Esprit");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
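Review note: the color and size branches in the spiders above repeat the same get / null-check / `addAll` / `put` sequence on every loop iteration. Below is a sketch of how that bookkeeping could be collapsed with `Map.computeIfAbsent`. It uses `String` in place of `ProductProp` so that it is self-contained; `PropIndex`, `add` and `get` are hypothetical names that are not part of this commit.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper, not part of this commit. */
final class PropIndex {
    private final Map<String, Set<String>> props = new HashMap<>();

    /** Adds a property to its group, creating the group's set on first use. */
    void add(String group, String prop) {
        props.computeIfAbsent(group, k -> new LinkedHashSet<>()).add(prop);
    }

    /** Returns the properties of a group, or an empty set if the group is unknown. */
    Set<String> get(String group) {
        return props.getOrDefault(group, Collections.<String>emptySet());
    }
}
```

In the real code the set elements are `ProductProp` objects, so de-duplication would rely on that class defining `equals` and `hashCode`; whether it does is not visible in this diff.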
package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.JsoupUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.XmlUtils;
import com.diaoyun.zion.master.util.SpiderUtil;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
/**
* Fendi(芬迪) item spider
*
* @author 爱酱油不爱醋
*/
@Component("fendiSpider")
public class FendiSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(FendiSpider.class);
/**
* Fendi(芬迪) item spider
* @param targetUrl URL of the product detail page
* @return formatted and translated JSON data
*/
@Override
public JSONObject captureItem(String targetUrl) throws URISyntaxException, IOException, ExecutionException, InterruptedException, TimeoutException {
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.FENDI.getValue());
ProductResponse productResponse = formatProductResponse(content);
JSONObject resultObj = JSONObject.fromObject(productResponse);
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the response data
* @param content the main page content
* @return the formatted data
*/
private ProductResponse formatProductResponse(String content) throws IOException, URISyntaxException {
// Create the response wrapper
ProductResponse productResponse = new ProductResponse();
// The product has selectable properties, so set the flag to true
productResponse.setPropFlag(true);
// Stock info: the total defaults to 9999; per-SKU quantities are read from the page below
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original prices and promotional prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the common ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic item info
ItemInfo itemInfo = new ItemInfo();
// Parse the page into a Document
Document document = Jsoup.parse(content);
// Use the product id to call the endpoint that lists all styles of the product
String pId = JsoupUtil.getScriptTagVariableContent(content, "window.productId");
String pUrl = "https://www.fendi.cn/api/rest/color?product_id=" + pId;
String pContent = HttpClientUtil.getContentByUrl(pUrl, PlatformEnum.FENDI.getValue());
pContent = XmlUtils.convertXmlIntoJSONObject(pContent);
JSONObject pUrlObj = JSONObject.fromObject(pContent);
//////////////////////////////////// Basic item info //////////////////////////////////////////////////
itemInfo.setShopName("Fendi");
itemInfo.setShopUrl("https://www.fendi.cn/");
itemInfo.setItemId(pId);
itemInfo.setTitle(document.select("div[class=info__summary]").text().trim());
//////////////////////////////////// Basic item info END ///////////////////////////////////////////////
JSONArray dataArr = pUrlObj.getJSONObject("magento_api").getJSONObject("data").getJSONArray("data_item");
for (int i = 0; i < dataArr.size(); i++) {
JSONObject dataObj = dataArr.getJSONObject(i);
//////////////////////////////////// Color property //////////////////////////////////////////////////
// Fetch the page of each style
pUrl = dataObj.getString("url");
content = HttpClientUtil.getContentByUrl(pUrl, PlatformEnum.FENDI.getValue());
String colorNo = dataObj.getString("id");
String imgUrl = dataObj.getString("image");
document = Jsoup.parse(content);
if (i == 0) {
itemInfo.setPic(imgUrl);
}
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(colorNo);
productPropColor.setPropName(document.select("div[class=info__summary]").text().trim());
productPropColor.setImage("http://" + imgUrl);
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
//////////////////////////////////// Color property END ////////////////////////////////////////////
String fullPrice = JsoupUtil.getScriptTagVariableContent(content, "window.final_price_value");
fullPrice = SpiderUtil.exchangeRate(fullPrice);
JSONObject skuObj = JsoupUtil.getItemDetailByName(content, "window.spStockItems");
logger.debug("Fendi stock items: {}", skuObj);
JSONObject sizeObj = JsoupUtil.getItemDetailByName(content, "window.spConfig");
JSONObject attributesObj = sizeObj.getJSONObject("attributes");
// Walk one level down; if there are several attributes, the options of the last one are used
Iterator<?> iterator = attributesObj.keys();
JSONArray optionsArr = new JSONArray();
while (iterator.hasNext()) {
String key = (String) iterator.next();
String value = attributesObj.getString(key);
optionsArr = JSONObject.fromObject(value).getJSONArray("options");
}
for (int j = 0; j < optionsArr.size(); j++) {
JSONObject optionsObj = optionsArr.getJSONObject(j);
///////////////////////// Size property ////////////////////
ProductProp productPropSize = new ProductProp();
String sizeNo = optionsObj.getJSONArray("products").toString();
sizeNo = SpiderUtil.retainNumber(sizeNo);
String size = optionsObj.getString("label");
productPropSize.setPropName(size);
productPropSize.setPropId(sizeNo);
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size property END ////////////////////
// SKU key of this color/size combination
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
//////////////////////////////////// Stock ////////////////////////////////////////////
// Lazily create the SKU stock list
ProductSkuStock productSkuStock = new ProductSkuStock();
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
// Read the stock quantity
int sellableQuantity = Integer.parseInt(skuObj.getJSONObject(sizeNo).getString("qty"));
productSkuStock.setSellableQuantity(sellableQuantity);
productSkuStock.setSkuStr(skuStr);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END ///////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(fullPrice);
originalPriceList.add(originalPrice);
productResponse.setPrice(fullPrice);
productResponse.setSalePrice(fullPrice + "-" + fullPrice);
//////////////////////////////////// Original price END //////////////////////////////////
}
}
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("Fendi");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
\ No newline at end of file
package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.vo.ProductResponse;
import com.diaoyun.zion.master.util.*;
import com.diaoyun.zion.master.util.spider.SpiderUtil;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.apache.commons.lang3.StringUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;
/**
* Gap item spider
*
* TODO: not usable
* @author G
*/
@Component("gapItemSpider")
public class GapItemSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(GapItemSpider.class);
//Gap product details
/**
* GAP product detail URL
*/
private static final String gapUrl="https://apicn.gap.cn/gap/store/product/list/searchProductByCondition.do";
/**
* Crawls the item and returns its data
* @param targetUrl URL of the product detail page
* @return the crawl result as JSON
*/
@Override
public JSONObject captureItem(String targetUrl) throws IOException, InterruptedException, ExecutionException, TimeoutException {
JSONObject resultObj;
......@@ -47,18 +59,19 @@ public class GapItemSpider implements IItemSpider {
if(resultObj.getBoolean("success")) {
// Convert to the response wrapper
ProductResponse productResponse = SpiderUtil.formatGapProductResponse(resultObj.getJSONObject("data"));
ProductResponse productResponse = formatGapProductResponse(resultObj.getJSONObject("data"));
resultObj=JSONObject.fromObject(productResponse);
// Translate
TranslateHelper.translateProductResponse(resultObj);
}
return resultObj;
}
/**
* Extracts the item id from the product URL
* @param targetUrl URL of the product detail page
* @return the item id
*/
private String getItemId(String targetUrl) {
String spuCode=targetUrl.substring(targetUrl.lastIndexOf("/")+1);
int firstUnder=spuCode.indexOf("_");
......@@ -66,6 +79,120 @@ public class GapItemSpider implements IItemSpider {
return spuCode.substring(firstUnder+1,lastUnder);
}
/**
* Formats the Gap response data
*
* @param dataMap the "data" node of the API response
* @return the formatted data
*/
private ProductResponse formatGapProductResponse(JSONObject dataMap) {
ProductResponse productResponse = new ProductResponse();
// Original prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
// Promotional prices
List<ProductPromotion> promotionList = new ArrayList<>();
Map<String, Set<ProductProp>> productPropSet = new HashMap<>();
JSONArray productList = dataMap.getJSONArray("productList");
// Item info
ItemInfo itemInfo = new ItemInfo();
for (int index = 0; index < productList.size(); index++) {
JSONObject propObj = productList.getJSONObject(index);
////////////////// Prices //////////////////
JSONArray skuList = propObj.getJSONArray("skuList");
for (int i = 0; i < skuList.size(); i++) {
JSONObject skuValue = skuList.getJSONObject(i);
JSONArray attrSaleList = skuValue.getJSONArray("attrSaleList");
String skuStr = ";";
for (int m = 0; m < attrSaleList.size(); m++) {
JSONObject attrSale = attrSaleList.getJSONObject(m);
JSONArray attributeValueList = attrSale.getJSONArray("attributeValueList");
skuStr = skuStr + attributeValueList.getJSONObject(0).getString("code") + ";";
}
// Original price
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
String listPrice = skuValue.getString("listPrice");
// Convert the exchange rate
listPrice = exchangeRate(listPrice);
originalPrice.setPrice(listPrice);
originalPriceList.add(originalPrice);
// Promotional price
if (StringUtils.isNotBlank(skuValue.getString("salePrice"))) {
String salePrice = skuValue.getString("salePrice");
// Convert the exchange rate
salePrice = exchangeRate(salePrice);
productResponse.setPromotionFlag(true);
ProductPromotion productPromotion = new ProductPromotion();
productPromotion.setSkuStr(skuStr);
productPromotion.setPrice(salePrice);
promotionList.add(productPromotion);
}
}
////////////////// Prices END //////////////////
////////////////// Product properties //////////////////
JSONArray attrSaleList = propObj.getJSONArray("attrSaleList");
for (int i = 0; i < attrSaleList.size(); i++) {
JSONArray attributeValueList = attrSaleList.getJSONObject(i).getJSONArray("attributeValueList");
// Properties of this attribute group
Set<ProductProp> propSet = new HashSet<>();
for (int j = 0; j < attributeValueList.size(); j++) {
ProductProp productProp = new ProductProp();
// Take the first image, if any
if (attributeValueList.getJSONObject(j).get("itemAttributeValueImageList") != null && !"null".equalsIgnoreCase(attributeValueList.getJSONObject(j).getString("itemAttributeValueImageList"))) {
JSONArray itemAttributeValueImageList = attributeValueList.getJSONObject(j).getJSONArray("itemAttributeValueImageList");
productProp.setImage(itemAttributeValueImageList.getJSONObject(0).getString("picUrl"));
}
productProp.setPropName(attributeValueList.getJSONObject(j).getString("attributeValueName"));
productProp.setPropId(attributeValueList.getJSONObject(j).getString("code"));
propSet.add(productProp);
}
String attributeFrontName = attrSaleList.getJSONObject(i).getString("attributeFrontName");
if (productPropSet.get(attributeFrontName) == null) {
productPropSet.put(attributeFrontName, propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get(attributeFrontName);
propSet.addAll(oldPropSet);
productPropSet.put(attributeFrontName, propSet);
}
}
////////////////// Product properties END //////////////////
itemInfo.setItemId(propObj.getString("style"));
if (propObj.get("itemImageList") != null && !"null".equalsIgnoreCase(propObj.getString("itemImageList"))) {
JSONArray itemImageList = propObj.getJSONArray("itemImageList");
if (!itemImageList.isEmpty()) {
String pic = itemImageList.getJSONObject(0).getString("picUrl");
// Use the first image as the main picture
itemInfo.setPic(pic);
}
}
itemInfo.setShopName(PlatformEnum.GAP.getLabel());
itemInfo.setShopUrl("https://www.gap.cn/");
itemInfo.setTitle(propObj.getString("title"));
}
String minPrice = dataMap.getString("minPrice");
String maxPrice = dataMap.getString("maxPrice");
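// Convert the exchange rate
minPrice = exchangeRate(minPrice);
maxPrice = exchangeRate(maxPrice);
// Fixed price, expressed as a min-max range
productResponse.setPrice(minPrice + "-" + maxPrice);
// Sale price, expressed as the same min-max range
productResponse.setSalePrice(minPrice + "-" + maxPrice);
// No stock info here; it has to be fetched separately
productResponse.setStockFlag(false);
// The product has selectable properties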
productResponse.setPropFlag(true);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setPromotionList(promotionList);
productResponse.setItemInfo(itemInfo);
productResponse.setPlatform(PlatformEnum.GAP.getValue());
productResponse.setProductPropSet(productPropSet);
return productResponse;
}
}
package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;
/**
* Spider for Gucci product data
*/
@Component("gucciSpider")
public class GucciSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(GucciSpider.class);
/**
* Crawls a Gucci product
* @param targetUrl URL of the product detail page
* @return the formatted and translated JSON data
*/
@Override
public JSONObject captureItem(String targetUrl) throws URISyntaxException, IOException, ExecutionException, InterruptedException, TimeoutException {
// The product id is the sixth "/"-separated segment of the URL, without the query string
String[] segments = targetUrl.split("/");
String pId = segments[5].split("[?]")[0];
String styleId = pId.substring(0, 5);
targetUrl = "https://www.gucci.cn/zh/pr/sameStyleBuriedPoint?itemCode=" + pId +"&style=" + styleId + "&categoryPath=&_=1572859976423";
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.GUCCI.getValue());
ProductResponse productResponse = formatProductResponse(content, pId);
JSONObject resultObj = JSONObject.fromObject(productResponse);
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the response data
* @param content the main page content
* @param pId the product id extracted from the URL
* @return the formatted data
*/
private ProductResponse formatProductResponse(String content, String pId) {
// Create the response wrapper
ProductResponse productResponse = new ProductResponse();
// The product has attributes, so set the flag to true
productResponse.setPropFlag(true);
// Stock information: with no usable stock data, default to 9999 overall (999 per SKU below)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original prices and promotional prices of the product
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product attributes; the common ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic product information
ItemInfo itemInfo = new ItemInfo();
Document document = Jsoup.parse(content);
//////////////////////////////////// Basic product information ////////////////////////////
itemInfo.setShopName("Gucci");
itemInfo.setShopUrl("https://www.gucci.cn/");
itemInfo.setItemId(pId);
itemInfo.setTitle(document.select("div[class=spice-fixed-change-rel-fixed]")
.select("a[id=product_main_image_0]").select("img").attr("alt"));
itemInfo.setPic(document.select("div[class=spice-fixed-change-rel-fixed]")
.select("a[id=product_main_image_0]").select("img").attr("srcset"));
//////////////////////////////////// Basic product information END /////////////////////////
// Parse the response as JSON and take the "impressions" array node
JSONObject dataMap = JSONObject.fromObject(content);
JSONArray impressionsArr = dataMap.getJSONObject("ecommerce").getJSONArray("impressions");
// Collect the id of each style (variant) of the product
List<String> pIdList = new ArrayList<>(10);
// List that stores the sale prices
List<Double> pPrice = new ArrayList<>(15);
for (int i = 0; i < impressionsArr.size(); i++) {
pIdList.add(impressionsArr.getJSONObject(i).getString("id"));
}
for (int i = 0; i < pIdList.size(); i++) {
// Take each style's product id and fetch its page content
pId = pIdList.get(i);
String targetUrl = "https://www.gucci.cn/zh/pr/" + pId + "?listName=VariationOverlay";
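try {
content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.GUCCI.getValue());
} catch (URISyntaxException | IOException e) {
// Log the failure and skip this style instead of re-parsing the previous page's content
logger.error("Failed to fetch Gucci style page: {}", targetUrl, e);
continue;
}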
document = Jsoup.parse(content);
String fullPrice = impressionsArr.getJSONObject(i).getString("price");
fullPrice = exchangeRate(fullPrice);
///////////////////////// Color attributes ////////////////////////////////////////////////////////////////
String colorNo = pId;
String color = document.select("span[class=spice-color-material]").select("img").attr("alt");
String imageUrl = document.select("div[class=spice-fixed-change-rel-fixed]").select("a[id=product_main_image_0]").select("img").attr("srcset");
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(pId);
productPropColor.setPropName(color);
productPropColor.setImage(imageUrl);
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
///////////////////////// Color attributes END ////////////////////////////////////////////////////////////////
Elements sizeEle = document.select("div[class=spice-dropdown spice-dropdown-semi-simulation spice-none]").select("option");
List<String> sizeNoList = sizeEle.eachAttr("spice-data-value");
List<String> sizeList = sizeEle.eachText();
if (sizeNoList.size() > 0) {
for (int j = 0; j < sizeNoList.size(); j++) {
///////////////////////// Size attributes /////////////////////////////////////////////////////
String sizeNo = sizeNoList.get(j);
String size = sizeList.get(j);
ProductProp productPropSize = new ProductProp();
productPropSize.setPropId(sizeNo);
productPropSize.setPropName(size);
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size attributes END /////////////////////////////////////////////////////
//////////////////////////////////// Stock /////////////////////////////////////////
// Build the skuStr
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSellableQuantity(999);
productSkuStock.setSkuStr(skuStr);
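// Some options are explanatory text rather than a real size; only labels shorter than 4 characters are treated as sizes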
if (size.length() < 4) {
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
}
//////////////////////////////////// Stock END /////////////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(fullPrice);
if (size.length() < 4) {
originalPriceList.add(originalPrice);
}
// Store the price
pPrice.add(Double.valueOf(fullPrice));
productResponse.setPrice(fullPrice);
//////////////////////////////////// Original price END //////////////////////////////////
}
} else {
//////////////////////////////////// Stock /////////////////////////////////////////
// Build the skuStr (no size available for this style)
String skuStr = ";" + colorNo + ";" + "" + ";";
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSellableQuantity(999);
productSkuStock.setSkuStr(skuStr);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END /////////////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(fullPrice);
originalPriceList.add(originalPrice);
// Store the price
pPrice.add(Double.valueOf(fullPrice));
productResponse.setPrice(fullPrice);
//////////////////////////////////// Original price END //////////////////////////////////
}
}
// Take the minimum and maximum of the stored prices
Double minPrice = Collections.min(pPrice);
Double maxPrice = Collections.max(pPrice);
productResponse.setSalePrice(minPrice + "-" + maxPrice);
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("Gucci");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
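The Gucci spider above finishes by calling `Collections.min` and `Collections.max` on the collected prices. Both throw `NoSuchElementException` on an empty list, which would happen if the `impressions` array came back empty. The sketch below shows one way to guard that step; the class `PriceRangeSketch`, the method `formatRange`, and the choice to return an empty string for "no prices" are assumptions, not behavior taken from the project.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: formats "min-max" from a list of prices, tolerating an empty list.
final class PriceRangeSketch {

    static String formatRange(List<Double> prices) {
        if (prices == null || prices.isEmpty()) {
            // Assumption: an empty string signals that no price could be collected.
            return "";
        }
        return Collections.min(prices) + "-" + Collections.max(prices);
    }

    public static void main(String[] args) {
        System.out.println(formatRange(Arrays.asList(20.0, 10.5))); // 10.5-20.0
    }
}
```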
......@@ -2,36 +2,39 @@ package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.ProductResponse;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.JsoupUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.spider.HMSpiderParse;
import com.diaoyun.zion.master.util.SpiderUtil;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.math.BigDecimal;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;
/**
* Spider for H&M product data
*
* TODO: the page data is pre-processed; no crawling method has been worked out yet
*
* @author 爱酱油不爱醋
*/
@Component("hmSpider")
public class HmSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(HmSpider.class);
/**
* H&M product detail page URL
*/
private static final String H_M_URL = "https://www2.hm.com/zh_cn/productpage";
/**
* Formats H&M data
* @param targetUrl
......@@ -42,28 +45,145 @@ public class HmSpider implements IItemSpider {
String[] spilt = targetUrl.split("productpage.");
targetUrl = "https://www2.hm.com/zh_cn/productpage." + spilt[1];
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.HM.getValue());
ProductResponse productResponse = HMSpiderParse.formatProductResponse(content);
ProductResponse productResponse = formatProductResponse(content);
JSONObject resultObj = JSONObject.fromObject(productResponse);
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
// public static void main(String[] args) throws Exception {
// String targetUrl = "https://m2.hm.com/m/zh_cn/productpage.0806412004.html";
// String[] spilt = targetUrl.split("productpage.");
// targetUrl = "https://www2.hm.com/zh_cn/productpage." + spilt[1];
// String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.HM.getValue());
// // Extract the main data and convert it into a JSON object and a Document
// String detailStr = JsoupUtil.getScriptContent(content, "productArticleDetails");
// int firstBrackets = detailStr.indexOf("{");
// int lastbrackets = detailStr.lastIndexOf("}");
// String resultStr = detailStr.substring(firstBrackets,lastbrackets+1);
// resultStr = resultStr.replaceAll("\'", "\"")
// .replaceAll("\"image\": isDesktop [?] ", "")
// .replaceAll("\"fullscreen\": isDesktop [?] ", "")
// .replaceAll("\"zoom\": isDesktop [?] ", "");
// JSONObject dataMap = JSONObject.fromObject(resultStr);
// Document document = Jsoup.parse(content);
// }
/**
* Formats the response data
*
* @param content page content
* @return the formatted data
*/
public static ProductResponse formatProductResponse(String content) {
// Extract the main data and convert it into a JSON object and a Document
String detailStr = JsoupUtil.getScriptContent(content, "productArticleDetails");
int firstBrackets = detailStr.indexOf("{");
int lastbrackets = detailStr.lastIndexOf("}");
String resultStr = detailStr.substring(firstBrackets, lastbrackets + 1);
resultStr = resultStr.replaceAll("\'", "\"")
.replaceAll("\"image\": isDesktop [?] ", "")
.replaceAll("\"fullscreen\": isDesktop [?] ", "")
.replaceAll("\"zoom\": isDesktop [?] ", "");
JSONObject dataMap = JSONObject.fromObject(resultStr);
Document document = Jsoup.parse(content);
// Create the response wrapper
ProductResponse productResponse = new ProductResponse();
// The product has attributes, so set the flag to true
productResponse.setPropFlag(true);
// Stock information: with no usable stock data, default to 9999 overall (999 per SKU below)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original prices and promotional prices of the product
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product attributes; the common ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic product information
ItemInfo itemInfo = new ItemInfo();
//////////////////////////////////// Basic product information ////////////////////////////
itemInfo.setShopName("H&M");
itemInfo.setShopUrl("https://www2.hm.com/");
itemInfo.setItemId(document.select("div[class=article-code]").select("li").text());
itemInfo.setTitle(document.select("h1[class=primary product-item-headline]").text());
//////////////////////////////////// Basic product information END (the picture is set below) /////////////////////////
// Read the original price
String fullPrice = document.select("div[class=primary-row product-item-price]").text();
fullPrice = SpiderUtil.retainNumber(fullPrice);
// TODO: convert with the exchange rate; the price is currently in RMB
fullPrice = exchangeRate(fullPrice);
BigDecimal priceOld = new BigDecimal(fullPrice);
BigDecimal div = new BigDecimal("100");
// RoundingMode.DOWN replaces the deprecated BigDecimal.ROUND_DOWN constant (same truncating behavior)
fullPrice = priceOld.divide(div, 2, java.math.RoundingMode.DOWN).toString();
//////////////////////////////////// Color attributes ////////////////////////////////////////////
// Read the color entries from the page
Elements colorEle = document.select("div[class=mini-slider]").select("ul[class=inputlist clearfix]").select("li");
for (Element element : colorEle) {
String colorNo = element.select("a").attr("data-articlecode");
String color = element.select("a").attr("data-color");
String imgUrl = "http:" + element.select("noscript").attr("data-src");
itemInfo.setPic(imgUrl);
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(colorNo);
productPropColor.setPropName(color);
productPropColor.setImage(imgUrl);
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
//////////////////////////////////// Color attributes END ////////////////////////////////////////////
///////////////////////// Size attributes ///////////////////////////////////////////////////////////
JSONArray sizeArr = dataMap.getJSONObject(colorNo).getJSONArray("sizes");
for (int i = 0; i < sizeArr.size(); i++) {
JSONObject sizeObj = sizeArr.getJSONObject(i);
String sizeNo = sizeObj.getString("sizeCode");
String size = sizeObj.getString("name");
ProductProp productPropSize = new ProductProp();
productPropSize.setPropId(sizeNo);
productPropSize.setPropName(size);
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size attributes END //////////////////////////////////////////////////////
// Build the skuStr
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
//////////////////////////////////// Stock ////////////////////////////////////////////
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSkuStr(skuStr);
productSkuStock.setSellableQuantity(999);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END /////////////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(fullPrice);
originalPriceList.add(originalPrice);
productResponse.setPrice(fullPrice);
productResponse.setSalePrice(fullPrice + "-" + fullPrice);
//////////////////////////////////// Original price END //////////////////////////////////
}
}
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("H&M");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
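The H&M price handling above divides the converted amount by 100 at two decimal places and truncates rather than rounds. The following sketch isolates that one step using `RoundingMode.DOWN`, the enum counterpart of the `BigDecimal.ROUND_DOWN` constant that has been deprecated since Java 9. The class `PriceScaleSketch` and the method `divideBy100` are illustrative names, and the sample amounts are made up.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper mirroring the divide-by-100 step with truncation to two decimals.
final class PriceScaleSketch {

    static String divideBy100(String amount) {
        return new BigDecimal(amount)
                .divide(new BigDecimal("100"), 2, RoundingMode.DOWN)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(divideBy100("12999"));  // 129.99
        System.out.println(divideBy100("19.999")); // 0.19 (truncated, not rounded to 0.20)
    }
}
```

Truncation keeps the displayed price from ever exceeding the computed one, which seems to be the intent of the original `ROUND_DOWN` choice, though the source does not say so.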
......@@ -2,11 +2,11 @@ package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.ProductResponse;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.JsoupUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.spider.LeviSpiderParse;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
......@@ -14,9 +14,12 @@ import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;
/**
* Spider for Levi product data
*
......@@ -26,25 +29,154 @@ import java.util.concurrent.TimeoutException;
public class LeviSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(LeviSpider.class);
/**
* Levi product detail page URL
*/
private static final String LEVI_URL = "https://www.levi.com.cn/product/";
/**
* Crawls a Levi product
* @see LeviSpiderParse#formatProductResponse the data-formatting method
* @param targetUrl URL of the product detail page
* @return the formatted and translated JSON data
*/
@Override
public JSONObject captureItem(String targetUrl) throws InterruptedException, IOException, ExecutionException, URISyntaxException, TimeoutException {
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.ESPRIT.getValue());
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.LEVI.getValue());
JSONObject dataMap = JsoupUtil.getItemDetailByName(content, "window.__INITIAL_STATE__");
ProductResponse productResponse = LeviSpiderParse.formatProductResponse(dataMap);
ProductResponse productResponse = formatProductResponse(dataMap);
if (productResponse.getItemInfo() == null) {
JSONObject notSpiderObj = new JSONObject();
notSpiderObj.put("message", "未找到此网站的数据爬虫!");
return notSpiderObj;
}
JSONObject resultObj = JSONObject.fromObject(productResponse);
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the response data
* @param dataMap the main JSON data
* @return the formatted data
*/
private ProductResponse formatProductResponse(JSONObject dataMap) {
// Create the response wrapper
ProductResponse productResponse = new ProductResponse();
// The product has attributes, so set the flag to true
productResponse.setPropFlag(true);
// Stock information: with no usable stock data, default to 9999 overall (999 per SKU below)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original prices and promotional prices of the product
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product attributes; the common ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic product information
ItemInfo itemInfo = new ItemInfo();
// Take the "details" node under "product"
JSONObject detailsObj = dataMap.getJSONObject("product").getJSONObject("details");
// If the data does not come from a product detail page, return early
if (!detailsObj.containsKey("code")) {
return productResponse;
}
// Read the product's original price
String fullPrice = detailsObj.getJSONObject("salePrice").getString("amount");
// TODO: convert with the exchange rate; the price is currently in RMB
fullPrice = exchangeRate(fullPrice);
//////////////////////////////////// Basic product information ////////////////////////////
itemInfo.setShopName("Levi");
itemInfo.setShopUrl("https://www.levi.com");
itemInfo.setItemId(detailsObj.getString("code"));
itemInfo.setTitle(detailsObj.getString("title"));
//////////////////////////////////// Basic product information END (the picture is set below) /////////////////////////
JSONArray values_0_Arr = detailsObj.getJSONArray("options").getJSONObject(0).getJSONArray("values");
JSONArray values_1_Arr = detailsObj.getJSONArray("options").getJSONObject(1).getJSONArray("values");
//////////////////////////////////// Color attributes ////////////////////////////////////////////
// Iterate the "values" array of options[0]
for (int i = 0; i < values_0_Arr.size(); i++) {
JSONObject values_0_Obj = values_0_Arr.getJSONObject(i);
String colorNo = values_0_Obj.getString("code");
String color = values_0_Obj.getString("displayName");
String imageUrl = values_0_Obj.getJSONArray("images").getJSONObject(0).getString("url");
if (i == 0) {
itemInfo.setPic(imageUrl);
}
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(colorNo);
productPropColor.setPropName(color);
productPropColor.setImage(imageUrl);
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
//////////////////////////////////// Color attributes END ////////////////////////////////////////////
///////////////////////// Size attributes ////////////////////////////////////////////////////////////////
// Iterate the "values" array of options[1]
for (int j = 0; j < values_1_Arr.size(); j++) {
JSONObject values_1_Obj = values_1_Arr.getJSONObject(j);
String sizeNo = values_1_Obj.getString("code");
String size = values_1_Obj.getString("displayName");
ProductProp productPropSize = new ProductProp();
productPropSize.setPropId(sizeNo);
productPropSize.setPropName(size);
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size attributes END /////////////////////////////////////////////////////
// Build the skuStr
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
//////////////////////////////////// Stock ////////////////////////////////////////////
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSkuStr(skuStr);
productSkuStock.setSellableQuantity(999);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END /////////////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(fullPrice);
originalPriceList.add(originalPrice);
productResponse.setPrice(fullPrice);
productResponse.setSalePrice(fullPrice + "-" + fullPrice);
//////////////////////////////////// Original price END //////////////////////////////////
}
}
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("Levi");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
\ No newline at end of file
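The Levi spider above pairs every color code with every size code and encodes each pair as a `skuStr` of the form `;color;size;`. As a rough sketch of just that pairing, under the assumption that codes are plain strings, the helper below builds the full list. `SkuStringSketch`, `buildSkuStrings`, and the sample codes are hypothetical and do not come from the project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds ";color;size;" strings for every color/size combination.
final class SkuStringSketch {

    static List<String> buildSkuStrings(List<String> colorCodes, List<String> sizeCodes) {
        List<String> skuStrings = new ArrayList<>();
        for (String colorCode : colorCodes) {
            for (String sizeCode : sizeCodes) {
                skuStrings.add(";" + colorCode + ";" + sizeCode + ";");
            }
        }
        return skuStrings;
    }

    public static void main(String[] args) {
        List<String> skus = buildSkuStrings(Arrays.asList("C1", "C2"), Arrays.asList("S", "M"));
        System.out.println(skus); // [;C1;S;, ;C1;M;, ;C2;S;, ;C2;M;]
    }
}
```

Note that this cross product assumes every size exists for every color, which is also what the spider assumes when it assigns a default stock of 999 to each pair.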
package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.spider.LilySpiderParse;
import net.sf.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
......@@ -34,7 +29,6 @@ public class LilySpider implements IItemSpider {
/**
* Lily data spider
* @see LilySpiderParse#formatProductResponse the data-formatting method
* @param targetUrl URL of the product detail page
* @return the formatted and translated JSON data
*/
......@@ -44,10 +38,11 @@ public class LilySpider implements IItemSpider {
return null;
}
public static void main(String[] args) throws Exception {
String targetUrl = "http://www.lily.sh.cn/webapp/wcs/stores/servlet/lilystore/24003/276409";
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.LILY.getValue());
Document document = Jsoup.parse(content);
System.err.println(document);
}
// public static void main(String[] args) throws Exception {
// String targetUrl = "http://www.lily.sh.cn/webapp/wcs/stores/servlet/lilystore/24003/276409";
// String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.LILY.getValue());
// Document document = Jsoup.parse(content);
//
// String str = document.select("input[id=skus]").attr("value");
// }
}
package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.SpiderUtil;
import net.sf.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;
/**
* Spider for LouisVuitton (LV) product data
*
* @author 爱酱油不爱醋
*/
@Component("louisVuittonSpider")
public class LouisVuittonSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(LouisVuittonSpider.class);
/**
* Crawls a LouisVuitton (LV) product
*
* @param targetUrl URL of the product detail page
* @return the formatted and translated JSON data
*/
@Override
public JSONObject captureItem(String targetUrl) throws InterruptedException, IOException, ExecutionException, URISyntaxException, TimeoutException {
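// FIXME: PlatformEnum.FENDI looks copied from another spider; switch to the Louis Vuitton platform constant once it is confirmed to exist
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.FENDI.getValue());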
ProductResponse productResponse = formatProductResponse(content);
JSONObject resultObj = JSONObject.fromObject(productResponse);
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the response data
*
* @param content the main page content
* @return the formatted data
*/
private ProductResponse formatProductResponse(String content) {
// Create the response wrapper
ProductResponse productResponse = new ProductResponse();
// The product has attributes, so set the flag to true
productResponse.setPropFlag(true);
// Stock information: with no usable stock data, default to 9999 overall (999 per SKU below)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original prices and promotional prices of the product
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product attributes; the common ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic product information
ItemInfo itemInfo = new ItemInfo();
Document document = Jsoup.parse(content);
Elements skuEle = document.select("div[id=infoProductBlock]");
// Read the product id
String pId = skuEle.select("span[class=sku]").text();
// Read one picture of the product and drop its query string
String imageUrl = document.select("li[id=productSheetSlideshowItem_0]").select("img").attr("src");
String[] urlParts = imageUrl.split("[?]");
imageUrl = urlParts[0];
// Read the price
String fullPrice = document.select("div[class=productAction]").select("span[class=priceValuePurchaseLayer]").text();
fullPrice = SpiderUtil.retainNumber(fullPrice);
fullPrice = exchangeRate(fullPrice);
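//////////////////////////////////// Basic product information ////////////////////////////
itemInfo.setShopName("LouisVuitton");
// FIXME: this is a Chanel URL, apparently copied from another spider; replace it with the Louis Vuitton shop URL
itemInfo.setShopUrl("https://inside.chanel.com/");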
itemInfo.setItemId(pId);
itemInfo.setTitle(skuEle.select("h1[class=productName]").text());
itemInfo.setPic(imageUrl);
//////////////////////////////////// Basic product information END /////////////////////////
///////////////////////// Color attributes ////////////////////////////////////////////////////////////////
// TODO: a color check still needs to be added here
String colorNo = pId;
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(pId);
productPropColor.setPropName(pId);
productPropColor.setImage(imageUrl);
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
///////////////////////// Color attributes END ////////////////////////////////////////////////////////////////
Elements sizeEle = skuEle.select("div[class=topPanelContent sizesPanel js-tracking]").select("ul[id=size]").select("li");
for (Element element : sizeEle) {
String sizeNo = element.attr("data-ona");
String size = element.select("span").text();
///////////////////////// Size attributes /////////////////////////////////////////////////////
ProductProp productPropSize = new ProductProp();
productPropSize.setPropId(sizeNo);
productPropSize.setPropName(size);
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size attributes END /////////////////////////////////////////////////////
//////////////////////////////////// Stock /////////////////////////////////////////
// Build the skuStr
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSellableQuantity(999);
productSkuStock.setSkuStr(skuStr);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END /////////////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(fullPrice);
originalPriceList.add(originalPrice);
productResponse.setPrice(fullPrice);
productResponse.setSalePrice(fullPrice + "-" + fullPrice);
//////////////////////////////////// Original price END //////////////////////////////////
}
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("LouisVuitton");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
\ No newline at end of file
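The LouisVuitton spider above trims the query string from the image URL with a regular-expression split on `?`. A small sketch of the same step without a regular expression follows; it also tolerates a `null` input, which the original does not need to because jsoup's `attr` returns an empty string for a missing attribute. `ImageUrlSketch`, `stripQuery`, and the sample URL are illustrative assumptions.

```java
// Hypothetical helper: returns the URL up to (not including) the first '?'.
final class ImageUrlSketch {

    static String stripQuery(String url) {
        if (url == null) {
            return "";
        }
        int queryStart = url.indexOf('?');
        return queryStart >= 0 ? url.substring(0, queryStart) : url;
    }

    public static void main(String[] args) {
        System.out.println(stripQuery("https://example.com/img/bag.jpg?wid=656&hei=656"));
        // https://example.com/img/bag.jpg
    }
}
```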
package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.SpiderUtil;
import net.sf.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
/**
* Spider for Maje product data
*
* @author 爱酱油不爱醋
*/
@Component("majeSpider")
public class MajeSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(MajeSpider.class);
/**
* Crawls a Maje product
* @param targetUrl URL of the product detail page
* @return the formatted and translated JSON data
*/
@Override
public JSONObject captureItem(String targetUrl) throws URISyntaxException, IOException, ExecutionException, InterruptedException, TimeoutException {
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.MAJE.getValue());
ProductResponse productResponse = formatProductResponse(content);
JSONObject resultObj = JSONObject.fromObject(productResponse);
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the response data
* @param content the main page content
* @return the formatted data
*/
public static ProductResponse formatProductResponse(String content) {
// Response wrapper object
ProductResponse productResponse = new ProductResponse();
// The product has selectable properties, so set the flag to true
productResponse.setPropFlag(true);
// Stock information: no usable stock figures are available, so the total defaults to 9999 (each SKU below gets 999)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original prices and promotional prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the usual ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic product information
ItemInfo itemInfo = new ItemInfo();
Document document = Jsoup.parse(content);
//////////////////////////////////// Basic product info ////////////////////////////
// itemInfo.setItemId(document.select(""));
itemInfo.setShopName("Maje");
itemInfo.setShopUrl("https://www.maje.cn/");
itemInfo.setTitle(document.select("meta[property=og:title]").attr("content"));
//////////////////////////////////// Basic product info END /////////////////////////
String fullPrice = document.select("meta[property=product:price:amount]").attr("content");
fullPrice = SpiderUtil.exchangeRate(fullPrice);
Elements pContentEle = document.select("div[id=product-content]").select("ul[class=dropdown-content]");
Elements colorsEle = pContentEle.select("ul[class=swatches Color]").select("a");
//////////////////////////////////// Color properties ////////////////////////////
for (Element colorEle : colorsEle) {
String colorNo = colorEle.attr("data-variationparameter");
String color = colorEle.attr("title");
// TODO: the image URL is not extracted yet
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(colorNo);
productPropColor.setPropName(color);
// productPropColor.setImage(imgUrl);
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
//////////////////////////////////// Color properties END ////////////////////////////////////////
///////////////////////// Size properties ///////////////////////////////////////////////////////
Elements sizesEle = pContentEle.select("ul[class=swatches size]").select("a");
for (Element sizeEle : sizesEle) {
String sizeNo = sizeEle.attr("data-variationparameter");
String size = sizeEle.attr("title");
ProductProp productPropSize = new ProductProp();
productPropSize.setPropId(sizeNo);
productPropSize.setPropName(size);
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size properties END///////////////////////////////////////////////////
//////////////////////////////////// Stock and original price ////////////////////////////////////////////
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSkuStr(skuStr);
productSkuStock.setSellableQuantity(999);
if (size.length() < 3) {
productSkuStockList.add(productSkuStock);
}
dynStock.setProductSkuStockList(productSkuStockList);
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(fullPrice);
if (size.length() < 3) {
originalPriceList.add(originalPrice);
}
productResponse.setPrice(fullPrice);
productResponse.setSalePrice(fullPrice + "-" + fullPrice);
//////////////////////////////////// Stock and original price END///////////////////////////////
}
}
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("Maje");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
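The color and size loops above register each property under its group key with a null check followed by a merge of the old set. A minimal sketch of the same grouping with `Map.computeIfAbsent`, under stated assumptions: `Prop` is a hypothetical stand-in for the project's `ProductProp` (only id and name are modelled), and de-duplication relies on the `equals`/`hashCode` defined here, which may differ from the real class.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

public class PropGroupingSketch {

    /** Hypothetical stand-in for ProductProp; equality is based on id and name. */
    static final class Prop {
        final String propId;
        final String propName;

        Prop(String propId, String propName) {
            this.propId = propId;
            this.propName = propName;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Prop)) {
                return false;
            }
            Prop other = (Prop) o;
            return Objects.equals(propId, other.propId)
                    && Objects.equals(propName, other.propName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(propId, propName);
        }
    }

    /** Adds a property to its group, creating the group's set on first use. */
    static void addProp(Map<String, Set<Prop>> groups, String groupName, Prop prop) {
        groups.computeIfAbsent(groupName, key -> new LinkedHashSet<>()).add(prop);
    }

    public static void main(String[] args) {
        Map<String, Set<Prop>> groups = new HashMap<>();
        addProp(groups, "颜色", new Prop("C01", "Black"));
        addProp(groups, "颜色", new Prop("C01", "Black")); // equal entry, not added twice
        addProp(groups, "尺码", new Prop("S", "S"));
        System.out.println(groups.get("颜色").size() + " color, " + groups.get("尺码").size() + " size");
    }
}
```

Because the set is created once per key and reused, the helper can be called from inside nested loops without re-putting the same set on every iteration.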
......@@ -2,17 +2,20 @@ package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.ProductResponse;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.spider.MassimoDuttiSpiderParse;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.SpiderUtil;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.math.BigDecimal;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
......@@ -23,17 +26,10 @@ import java.util.concurrent.TimeoutException;
*/
@Component("massimoduttiSpider")
public class MassimoduttiSpider implements IItemSpider {
private static final Logger logger = LoggerFactory.getLogger(MassimoduttiSpider.class);
/**
* Massimo Dutti product detail page URL
*/
private static final String MASSIMO_DUTTI_URL = "https://www.massimodutti.cn/cn/";
/**
* Massimo Dutti data spider.
* @see MassimoDuttiSpiderParse#formatProductResponse formatting method
* @param targetUrl URL of the product detail page
* @return the formatted and translated JSON data
*/
......@@ -47,10 +43,143 @@ public class MassimoduttiSpider implements IItemSpider {
String dataUrl = "https://www.massimodutti.cn/itxrest/2/catalog/store/35009478/30359500/category/0/product/" + pId + "/detail?languageId=-7&appId=1";
String content = HttpClientUtil.getContentByUrl(dataUrl, PlatformEnum.MASSIMODUTTI.getValue());
JSONObject resultObj = JSONObject.fromObject(content);
ProductResponse productResponse = MassimoDuttiSpiderParse.formatProductResponse(resultObj, pId);
ProductResponse productResponse = formatProductResponse(resultObj, pId);
resultObj = JSONObject.fromObject(productResponse);
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the API response into a product response.
* @param dataMap the main JSON data
* @param pId product id taken from the product URL
* @return the formatted data
*/
private ProductResponse formatProductResponse(JSONObject dataMap, String pId) {
// Response wrapper object
ProductResponse productResponse = new ProductResponse();
// The product has selectable properties, so set the flag to true
productResponse.setPropFlag(true);
// Stock information: no usable stock figures are available, so the total defaults to 9999 (each SKU below gets 999)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original prices and promotional prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the usual ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSetColor = new HashSet<>(16);
Set<ProductProp> sizePropSetSize = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic product information
ItemInfo itemInfo = new ItemInfo();
//////////////////////////////////// Basic product info ////////////////////////////////////////////
itemInfo.setShopName("MassimoDutti");
itemInfo.setShopUrl("https://www.massimodutti.cn/cn/");
itemInfo.setItemId(pId);
itemInfo.setTitle(dataMap.getString("name"));
//////////////////////////////////// Basic product info END (the picture is set below) ////////////////////////////////////////////
// Read the "detail" object node
JSONObject detailObj = dataMap.getJSONObject("detail");
// Read the "colors" array node
JSONArray colorsArr = detailObj.getJSONArray("colors");
for (int i = 0; i < colorsArr.size(); i++) {
JSONObject colorsObj = colorsArr.getJSONObject(i);
JSONObject imageObj = colorsObj.getJSONObject("image");
String imageUrl = "https://static.massimodutti.cn/3/photos"
+ imageObj.getString("url")
+ "_2_5_16.jpg?t="
+ imageObj.getString("timestamp");
if (i == 0) {
itemInfo.setPic(imageUrl);
}
//////////////////////////////////// Color properties ////////////////////////////////////////////
String colorNo = colorsObj.getString("id");
String color = colorsObj.getString("name");
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(colorNo);
productPropColor.setPropName(color);
productPropColor.setImage(imageUrl);
propSetColor.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSetColor);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSetColor.addAll(oldPropSet);
productPropSet.put("颜色", propSetColor);
}
//////////////////////////////////// Color properties END ////////////////////////////////////////////
///////////////////////// Size properties ////////////////////
// Read the "sizes" array node
JSONArray sizesArr = colorsObj.getJSONArray("sizes");
for (int j = 0; j < sizesArr.size(); j++) {
JSONObject sizesObj = sizesArr.getJSONObject(j);
String sizeNo = sizesObj.getString("sku");
String size = sizesObj.getString("name");
ProductProp productPropSize = new ProductProp();
productPropSize.setPropName(size);
productPropSize.setPropId(sizeNo);
sizePropSetSize.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSetSize);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSetSize.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSetSize);
}
///////////////////////// Size properties END////////////////////
// SKU key
// Trailing ";" added so the key has the same ";color;size;" shape used by the other spiders
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
//////////////////////////////////// Stock ////////////////////////////////////////////
// Create the SKU stock list if the DynStock does not have one yet
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSkuStr(skuStr);
productSkuStock.setSellableQuantity(999);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END/////////////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
// The raw price appears to be expressed in cents, so divide by 100 before converting the currency
String fullPrice = sizesObj.getString("price");
BigDecimal priceOld = new BigDecimal(fullPrice);
BigDecimal div = new BigDecimal("100");
BigDecimal priceNew = priceOld.divide(div, 2, java.math.RoundingMode.HALF_UP);
fullPrice = SpiderUtil.exchangeRate(priceNew.toString());
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(fullPrice);
originalPriceList.add(originalPrice);
productResponse.setPrice(fullPrice);
productResponse.setSalePrice(fullPrice + "-" + fullPrice);
//////////////////////////////////// Original price END//////////////////////////////////
}
}
// Populate the JSON response fields in the following order
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("MassimoDutti");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
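The price handling above divides the raw price string by 100 and rounds to two decimals before the currency conversion. A minimal sketch of just that step, assuming the API value is a numeric string in minor units (cents); `toMajorUnits` is a hypothetical helper name, not part of the project.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class MinorUnitsSketch {

    /** Converts a price in minor units (e.g. cents) to a two-decimal major-unit string. */
    static String toMajorUnits(String minorUnits) {
        return new BigDecimal(minorUnits)
                .divide(new BigDecimal("100"), 2, RoundingMode.HALF_UP)
                .toString();
    }

    public static void main(String[] args) {
        // Illustrative value only, not taken from a real response
        System.out.println(toMajorUnits("129900")); // prints 1299.00
    }
}
```

`RoundingMode.HALF_UP` is the enum counterpart of the older `BigDecimal.ROUND_HALF_UP` int constant, which is deprecated in recent JDKs.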
......@@ -2,10 +2,10 @@ package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.ProductResponse;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.spider.MocoSpiderParse;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
......@@ -13,9 +13,12 @@ import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;
/**
* MO&Co. data spider
*
......@@ -23,17 +26,11 @@ import java.util.concurrent.TimeoutException;
*/
@Component("mocoSpider")
public class MocoSpider implements IItemSpider {
private static final Logger logger = LoggerFactory.getLogger(MocoSpider.class);
/**
* MO&Co. product detail page URL
*/
private static final String MOCO_URL = "https://www.moco.com/moco/zh/p/";
/**
* MO&Co. data spider.
* @see MocoSpiderParse#formatProductResponse formatting method
*
* @param targetUrl URL of the product detail page
* @return the formatted and translated JSON data
*/
......@@ -47,12 +44,140 @@ public class MocoSpider implements IItemSpider {
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.MOCO.getValue());
JSONObject resultObj = JSONObject.fromObject(content);
// Format the data
ProductResponse productResponse = MocoSpiderParse.formatProductResponse(resultObj, pId);
ProductResponse productResponse = formatProductResponse(resultObj, pId);
resultObj = JSONObject.fromObject(productResponse);
// Translate the data
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the API response into a product response.
*
* @param dataMap the main JSON content
* @param pId product id extracted from the URL
* @return the formatted data
*/
private ProductResponse formatProductResponse(JSONObject dataMap, String pId) {
// Response wrapper object
ProductResponse productResponse = new ProductResponse();
// The product has selectable properties, so set the flag to true
productResponse.setPropFlag(true);
// Stock information: no usable stock figures are available, so the total defaults to 9999 (each SKU below gets 999)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original prices and promotional prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the usual ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(false);
// Basic product information
ItemInfo itemInfo = new ItemInfo();
// Read the "productData" object node
JSONObject productDataObj = dataMap.getJSONObject("productData");
//////////////////////////////////// Basic product info ////////////////////////////////////////////
itemInfo.setShopName("MO&Co.");
itemInfo.setShopUrl("https://en.mo-co.com/");
itemInfo.setItemId(pId);
itemInfo.setTitle(productDataObj.getString("name"));
//////////////////////////////////// Basic product info END (the picture is set below) ////////////////////////////////////////////
JSONArray options_1_Arr = productDataObj.getJSONArray("baseOptions").getJSONObject(1).getJSONArray("options");
JSONArray options_0_Arr = productDataObj.getJSONArray("baseOptions").getJSONObject(0).getJSONArray("options");
//////////////////////////////////// Color properties ////////////////////////////////////////////
for (int i = 0; i < options_1_Arr.size(); i++) {
JSONObject options_1_Obj = options_1_Arr.getJSONObject(i);
// Split the image URL so that its suffix can be reused below
String[] splitImg = options_1_Obj.getJSONArray("variantOptionQualifiers")
.getJSONObject(0).getJSONObject("image").getString("url").split("_other_");
String colorNo = options_1_Obj.getString("epoColorCode");
String color = options_1_Obj.getString("epoColorName");
String imageUrl = "https://mallimg.moco.com/" + pId + "_list_" + splitImg[1];
if (i == 0) {
itemInfo.setPic(imageUrl);
}
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(colorNo);
productPropColor.setPropName(color);
productPropColor.setImage(imageUrl);
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
//////////////////////////////////// Color properties END ////////////////////////////////////////////
///////////////////////// Size properties ///////////////////////////////////////////////////////////
for (int j = 0; j < options_0_Arr.size(); j++) {
JSONObject options_0_Obj = options_0_Arr.getJSONObject(j);
String sizeNo = options_0_Obj.getString("epoSizeCode");
String size = options_0_Obj.getString("epoSizeName") + options_0_Obj.getString("sizeDescription");
ProductProp productPropSize = new ProductProp();
productPropSize.setPropId(sizeNo);
productPropSize.setPropName(size);
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size properties END/////////////////////////////////////////////////////
// Build the SKU key
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
//////////////////////////////////// Stock ////////////////////////////////////////////
// Create the SKU stock list if the DynStock does not have one yet
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSkuStr(skuStr);
productSkuStock.setSellableQuantity(999);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END/////////////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
String fullPrice = productDataObj.getJSONObject("price").getString("value");
fullPrice = exchangeRate(fullPrice);
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(fullPrice);
originalPriceList.add(originalPrice);
productResponse.setPrice(fullPrice);
productResponse.setSalePrice(fullPrice + "-" + fullPrice);
//////////////////////////////////// Original price END//////////////////////////////////
}
}
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("MO&Co.");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.google.gson.Gson;
import com.google.gson.JsonArray;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
......@@ -12,14 +11,18 @@ import java.net.URISyntaxException;
/**
* Network API calls
*
* @author G
*/
public class NetWorkSpider {
// Calls the exchange-rate API of hexun.com: http://webforex.hermes.hexun.com/forex/quotelist?code=FOREXUSDX,
// FOREXUSDCNY,FOREXEURUSD,FOREXUSDJPY,FOREXGBPUSD,FOREXUSDCAD,FOREXUSDCHF,
// FOREXAUDUSD,FOREXGBPJPY,FOREXXAUUSD&column=Price,Code,Name,UpdownRate,PriceWeight'
private static final String exchangeRateUrl = "http://webforex.hermes.hexun.com/forex/quotelist?";
/**
* Calls the exchange-rate API of hexun.com: http://webforex.hermes.hexun.com/forex/quotelist?code=FOREXUSDX,
*
* FOREXUSDCNY,FOREXEURUSD,FOREXUSDJPY,FOREXGBPUSD,FOREXUSDCAD,FOREXUSDCHF,
* FOREXAUDUSD,FOREXGBPJPY,FOREXXAUUSD&column=Price,Code,Name,UpdownRate,PriceWeight'
*/
private static final String exchangeRateUrl = "http://webforex.hermes.hexun.com/forex/quotelist?";
/**
* Fetches the exchange rate from hexun.com
......
package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.constant.KeyConstant;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.ProductResponse;
import com.diaoyun.zion.master.util.*;
import com.diaoyun.zion.master.util.spider.SpiderUtil;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.JsoupUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;
/**
* Nike data spider
*
* @author G
*/
@Component("nikeItemSpider")
public class NikeItemSpider implements IItemSpider {
private static final Logger logger = LoggerFactory.getLogger(NikeItemSpider.class);
@Override
......@@ -30,12 +38,159 @@ public class NikeItemSpider implements IItemSpider {
// Fetch the product data; the details live in the window.INITIAL_REDUX_STATE variable inside a <script> tag
resultObj = JsoupUtil.getItemDetailByName(content, "window.INITIAL_REDUX_STATE");
// Format into the response wrapper
ProductResponse productResponse = SpiderUtil.formatNikeProductResponse(resultObj);
ProductResponse productResponse = formatNikeProductResponse(resultObj);
resultObj = JSONObject.fromObject(productResponse);
// Translate
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the data returned by Nike.
*
* @param dataMap JSON parsed from window.INITIAL_REDUX_STATE
* @return the formatted product response
*/
private ProductResponse formatNikeProductResponse(JSONObject dataMap) {
ProductResponse productResponse = new ProductResponse();
// Nike products mostly have color and size properties
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
// Original prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
// Promotional prices
List<ProductPromotion> promotionList = new ArrayList<>();
// Stock
DynStock dynStock = new DynStock();
// The data does not include exact stock figures, so default to an ample quantity
dynStock.setSellableQuantity(9999);
// Basic product information
ItemInfo itemInfo = new ItemInfo();
JSONObject threadObj = dataMap.getJSONObject("Threads");
JSONObject productsObj = threadObj.getJSONObject("products");
Set es = productsObj.entrySet();
Iterator it = es.iterator();
while (it.hasNext()) {
Map.Entry<String, JSONObject> entry = (Map.Entry) it.next();
String skuStr = ";";
String modelCode = entry.getKey();
skuStr = skuStr + modelCode + ";";
JSONObject itemDetail = entry.getValue();
//////////////////////////////////// Prices and product properties ////////////////////////////////////////////
String fullPrice = itemDetail.getString("fullPrice");
// Convert the currency
fullPrice = exchangeRate(fullPrice);
String currentPrice = itemDetail.getString("currentPrice");
// Convert the currency
currentPrice = exchangeRate(currentPrice);
productResponse.setPrice(fullPrice);
JSONArray skusArr = itemDetail.getJSONArray("skus");
// Collect the size properties and record the skuId-to-size mapping
Map<String, String> sizeSkuIdMapping = new HashMap<>(16);
for (int i = 0; i < skusArr.size(); i++) {
String skuId = skusArr.getJSONObject(i).getString("skuId");
///////////////////////// Size properties ////////////////////
// Product property
Set<ProductProp> sizePropSet = new HashSet<>();
ProductProp productProp = new ProductProp();
String localizedSize = skusArr.getJSONObject(i).getString("localizedSize");
String localizedSizePrefix = skusArr.getJSONObject(i).getString("localizedSizePrefix");
// Identical sizes can carry different skuIds, so derive a single propId from the size itself; otherwise later de-duplication would fail
String customizeId = KeyConstant.CUSTOMIZE_ID + localizedSize + localizedSizePrefix;
sizeSkuIdMapping.put(skuId, customizeId);
productProp.setPropId(customizeId);
productProp.setPropName(localizedSizePrefix + " " + localizedSize);
sizePropSet.add(productProp);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size properties END////////////////////
}
//////////////////////////////////// Prices //////////////////////////////////
for (int i = 0; i < skusArr.size(); i++) {
String skuId = skusArr.getJSONObject(i).getString("skuId");
String customizeId = sizeSkuIdMapping.get(skuId);
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr + customizeId + ";");
originalPrice.setPrice(fullPrice);
originalPriceList.add(originalPrice);
if (itemDetail.getBoolean("discounted")) {
productResponse.setPromotionFlag(true);
productResponse.setSalePrice(currentPrice);
ProductPromotion productPromotion = new ProductPromotion();
productPromotion.setSkuStr(skuStr + customizeId + ";");
// The promotion entry carries the discounted price, not the full price
productPromotion.setPrice(currentPrice);
promotionList.add(productPromotion);
}
}
//////////////////////////////////// Prices END//////////////////////////////////
///////////////////////////////////// Prices and product properties END////////////////////////////////////////////
//////////////////////////////////// Stock ////////////////////////////////////////////
productResponse.setStockFlag(true);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
JSONArray availableSkusArr = itemDetail.getJSONArray("availableSkus");
for (int i = 0; i < availableSkusArr.size(); i++) {
String skuId = availableSkusArr.getJSONObject(i).getString("skuId");
String customizeId = sizeSkuIdMapping.get(skuId);
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSellableQuantity(999);
productSkuStock.setSkuStr(skuStr + customizeId + ";");
productSkuStockList.add(productSkuStock);
}
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END////////////////////////////////////////////
//////////////////////////////////// Color properties ////////////////////////////////////////////
// Product property
Set<ProductProp> propSet = new HashSet<>();
ProductProp productProp = new ProductProp();
String colorDescription = itemDetail.getString("colorDescription");
String firstImageUrl = itemDetail.getString("firstImageUrl");
productProp.setPropId(modelCode);
productProp.setPropName(colorDescription);
productProp.setImage(firstImageUrl);
propSet.add(productProp);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
//////////////////////////////////// Color properties END////////////////////////////////////////////
}
JSONObject globalObj = dataMap.getJSONObject("global");
JSONObject metaTagsObj = globalObj.getJSONObject("metaTags");
JSONArray metaArr = metaTagsObj.getJSONArray("meta");
for (int i = 0; i < metaArr.size(); i++) {
if (metaArr.getJSONObject(i).get("property") != null) {
String propertyValue = metaArr.getJSONObject(i).getString("property");
if ("og:title".equalsIgnoreCase(propertyValue)) {
itemInfo.setTitle(metaArr.getJSONObject(i).getString("content"));
}
if ("og:image".equalsIgnoreCase(propertyValue)) {
itemInfo.setPic(metaArr.getJSONObject(i).getString("content"));
}
}
}
itemInfo.setShopUrl("https://www.nike.com/cn/");
itemInfo.setShopName(PlatformEnum.NIKE.getLabel());
productResponse.setPropFlag(true);
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform(PlatformEnum.NIKE.getValue());
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
......@@ -2,18 +2,22 @@ package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.ProductResponse;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.SpiderUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.spider.OchirlySpiderParse;
import net.sf.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
......@@ -26,14 +30,8 @@ import java.util.concurrent.TimeoutException;
public class OchirlySpider implements IItemSpider {
private static final Logger logger = LoggerFactory.getLogger(OchirlySpider.class);
/**
* Ochirly product detail page URL
*/
private static final String URBANREVIVO_URL = "http://www.ochirly.com.cn/p/mobile/";
/**
* Returns the crawled data.
* @see OchirlySpiderParse#formatProductResponse formatting method
* @param targetUrl URL of the product detail page
* @return the formatted and translated JSON data
* @see
......@@ -41,11 +39,149 @@ public class OchirlySpider implements IItemSpider {
@Override
public JSONObject captureItem(String targetUrl) throws URISyntaxException, IOException, ExecutionException, InterruptedException, TimeoutException {
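// NOTE: this passes the UNDERARMOUR platform value; an Ochirly-specific value is presumably intended
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.UNDERARMOUR.getValue());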
ProductResponse productResponse = OchirlySpiderParse.formatProductResponse(content);
ProductResponse productResponse = formatProductResponse(content);
JSONObject resultObj = JSONObject.fromObject(productResponse);
// Translate
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the scraped page into a product response.
* @param content HTML of the product detail page
* @return the formatted data
*/
private ProductResponse formatProductResponse(String content) {
// Response wrapper object
ProductResponse productResponse = new ProductResponse();
// The product has selectable properties, so set the flag to true
productResponse.setPropFlag(true);
// Stock information: no usable stock figures are available, so the total defaults to 9999 (each SKU below gets 999)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original prices and promotional prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the usual ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic product information
ItemInfo itemInfo = new ItemInfo();
Document document = Jsoup.parse(content);
// Title
Elements detailEle = document.select("div[class=detail]").select("div[class=desc]");
String pTitle = detailEle.select("h5").text();
// Price
Elements priceEle = detailEle.select("p[class=price]");
String fullPrice = priceEle.attr("data-list-price");
// Color ids and images
Elements colorEle = document.select("div[class=color]").select("ul[class=clearfix]");
List<String> imgUrlList = colorEle.select("a").eachAttr("href");
List<String> pColorNoList = new ArrayList<>();
for (int i = 0; i < imgUrlList.size(); i++) {
String hrefStr = imgUrlList.get(i);
if (hrefStr.contains("/p/mobile/")) {
String[] split = hrefStr.split("/mobile/");
// String.replace treats ".shtml" literally; replaceAll would read the dot as a regex wildcard
pColorNoList.add(split[1].replace(".shtml", ""));
} else {
pColorNoList.add(0, priceEle.attr("data-sku"));
}
}
List<String> pColorList = new ArrayList<>();
pColorList.addAll(pColorNoList);
List<String> pImgList = colorEle.select("img").eachAttr("src");
// Sizes
Elements sizeEle = document.select("div[class=size]").select("div[class=size_contain]").select("li");
List<String> pSizeList = new ArrayList<>();
List<String> pSizeNoList = new ArrayList<>();
for (Element element : sizeEle) {
if (element.hasAttr("data-size-id")) {
pSizeList.add(element.text());
pSizeNoList.add(element.attr("data-size-id"));
}
}
//////////////////////////////////// Basic product info ////////////////////////////
itemInfo.setShopName("Ochirly");
itemInfo.setShopUrl("www.ochirly.com");
itemInfo.setItemId(detailEle.select("p[class=price]").attr("data-sku"));
itemInfo.setTitle(pTitle);
itemInfo.setPic(pImgList.get(0));
//////////////////////////////////// Basic product info END /////////////////////////
//////////////////////////////////// Color properties ////////////////////////////
for (int i = 0; i < pColorList.size(); i++) {
String colorNo = pColorList.get(i);
String color = pColorNoList.get(i);
String imgUrl = pImgList.get(i);
ProductProp productPropColor = new ProductProp();
productPropColor.setPropName(colorNo);
productPropColor.setPropId(color);
productPropColor.setImage(imgUrl);
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
///////////////////////// Size properties ////////////////////
for (int j = 0; j < pSizeList.size(); j++) {
String sizeNo = pSizeNoList.get(j);
String size = pSizeList.get(j);
ProductProp productPropSize = new ProductProp();
productPropSize.setPropId(sizeNo);
productPropSize.setPropName(size);
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size properties END////////////////////
// Build the SKU key
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSellableQuantity(999);
productSkuStock.setSkuStr(skuStr);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
// TODO: exchange-rate conversion; the source price is in RMB
String originalFullPrice = SpiderUtil.exchangeRate(fullPrice);
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(originalFullPrice);
originalPriceList.add(originalPrice);
productResponse.setPrice(originalFullPrice);
productResponse.setSalePrice(originalFullPrice + "-" + originalFullPrice);
}
}
//////////////////////////////////// Color properties END ////////////////////////////////////////////
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("Ochirly");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
\ No newline at end of file
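The nested loops above pair every color with every size and key each combination with a `;color;size;` string. A minimal sketch of that keying step in isolation, assuming plain string ids; `SkuKeys` and `build` are hypothetical names that do not exist in the project:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of the project: builds ";color;size;" SKU keys. */
final class SkuKeys {
    private SkuKeys() {
    }

    /** Returns one key per color/size combination, colors in the outer loop. */
    static List<String> build(List<String> colorIds, List<String> sizeIds) {
        List<String> keys = new ArrayList<>(colorIds.size() * sizeIds.size());
        for (String colorId : colorIds) {
            for (String sizeId : sizeIds) {
                keys.add(";" + colorId + ";" + sizeId + ";");
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // Two colors and two sizes give four SKU keys
        System.out.println(build(Arrays.asList("c1", "c2"), Arrays.asList("S", "M")));
    }
}
```

Building the keys once, outside the property bookkeeping, keeps the stock list and the price list aligned on the same strings.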
package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.SpiderUtil;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.math.BigDecimal;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
/**
* Oysho data spider
*
* @author 爱酱油不爱醋
*/
@Component("oyshoSpider")
public class OyshoSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(OyshoSpider.class);
/**
* Oysho data spider
* @param targetUrl URL of the product detail page
* @return the formatted and translated JSON data
*/
@Override
public JSONObject captureItem(String targetUrl) throws InterruptedException, IOException, ExecutionException, URISyntaxException, TimeoutException {
String pId;
if (targetUrl.contains("origenId")) {
// The id is passed explicitly as a query parameter
pId = targetUrl.split("origenId=")[1];
} else {
// Otherwise the id sits between the last "p" before ".html" and ".html" itself
int htmlIdx = targetUrl.lastIndexOf(".html");
pId = targetUrl.substring(targetUrl.lastIndexOf("p", htmlIdx) + 1, htmlIdx);
}
targetUrl = "https://www.oysho.cn/itxrest/2/catalog/store/65009628/60361118/category/0/product/" + pId + "/detail";
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.OYSHO.getValue());
ProductResponse productResponse = formatProductResponse(content, pId);
JSONObject resultObj = JSONObject.fromObject(productResponse);
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Format the response data
*
* @param content the main page data
* @param pId the product id extracted from the URL
* @return the formatted data
*/
private ProductResponse formatProductResponse(String content, String pId) {
// Create the wrapper object
ProductResponse productResponse = new ProductResponse();
// The product has properties, so set the flag to true
productResponse.setPropFlag(true);
// Stock info: with no real stock data, use placeholders (9999 overall, 999 per SKU)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original and promotional prices of the product
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the common ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic product info
ItemInfo itemInfo = new ItemInfo();
JSONObject dataMap = JSONObject.fromObject(content);
//////////////////////////////////// Basic product info ////////////////////////////
itemInfo.setItemId(pId);
itemInfo.setShopName("Oysho");
itemInfo.setShopUrl("https://www.oysho.cn/");
itemInfo.setTitle(dataMap.getString("name"));
//////////////////////////////////// Basic product info END /////////////////////////
// The colors array sits at different nodes depending on the JSON the API returns
logger.debug("bundleProductSummaries size: {}", dataMap.getJSONArray("bundleProductSummaries").size());
JSONArray colorArr;
if (dataMap.getJSONArray("bundleProductSummaries").size() != 0) {
colorArr = dataMap.getJSONArray("bundleProductSummaries").getJSONObject(0).getJSONObject("detail").getJSONArray("colors");
} else {
colorArr = dataMap.getJSONObject("detail").getJSONArray("colors");
}
//////////////////////////////////// Color properties ////////////////////////////
for (int i = 0; i < colorArr.size(); i++) {
JSONObject colorObj = colorArr.getJSONObject(i);
String colorNo = colorObj.getString("id");
String color = colorObj.getString("name");
// Build the image URL
JSONObject imageObj = colorObj.getJSONObject("image");
String imgUrl = "https://static.oysho.cn/6/photos2"
+ imageObj.getString("url") + ".jpg?t=" + imageObj.getString("timestamp");
if (i == 0) {
itemInfo.setPic(imgUrl);
}
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(colorNo);
productPropColor.setPropName(color);
productPropColor.setImage(imgUrl);
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
//////////////////////////////////// Color properties END ////////////////////////////////////////
///////////////////////// Size properties ///////////////////////////////////////////////////////
JSONArray sizesArr = colorObj.getJSONArray("sizes");
for (int j = 0; j < sizesArr.size(); j++) {
JSONObject sizesObj = sizesArr.getJSONObject(j);
String sizeNo = sizesObj.getString("sku");
String size = sizesObj.getString("name");
ProductProp productPropSize = new ProductProp();
productPropSize.setPropId(sizeNo);
productPropSize.setPropName(size);
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size properties END ///////////////////////////////////////////////////
//////////////////////////////////// Stock and original price ////////////////////////////////////////////
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
String fullPrice = sizesObj.getString("price");
BigDecimal priceOld = new BigDecimal(fullPrice);
BigDecimal div = new BigDecimal("100");
fullPrice = priceOld.divide(div, 2, BigDecimal.ROUND_DOWN).toString();
String originalFullPrice = SpiderUtil.exchangeRate(fullPrice);
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSkuStr(skuStr);
productSkuStock.setSellableQuantity(999);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(originalFullPrice);
originalPriceList.add(originalPrice);
productResponse.setPrice(originalFullPrice);
productResponse.setSalePrice(originalFullPrice + "-" + originalFullPrice);
//////////////////////////////////// Stock and original price END ///////////////////////////////
}
}
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("Oysho");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
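The URL handling in `captureItem` above can be sketched in isolation. This is a minimal sketch, assuming Oysho product links either carry an `origenId=` query parameter or end in `p<digits>.html` (the second shape is inferred from the PullAndBear example URL elsewhere in this commit, not confirmed for Oysho). `ProductIdExtractor` and `extractProductId` are hypothetical names and not part of the project.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of the project: pulls a product id out of a detail URL. */
final class ProductIdExtractor {
    // Assumed shape 1: ...?origenId=123456
    private static final Pattern ORIGEN_ID = Pattern.compile("[?&]origenId=(\\d+)");
    // Assumed shape 2: ...c1030204837p501658014.html, optionally followed by a query string
    private static final Pattern PATH_ID = Pattern.compile("p(\\d+)\\.html");

    private ProductIdExtractor() {
    }

    /** Returns the id if either assumed shape matches, otherwise an empty Optional. */
    static Optional<String> extractProductId(String url) {
        Matcher byParam = ORIGEN_ID.matcher(url);
        if (byParam.find()) {
            return Optional.of(byParam.group(1));
        }
        Matcher byPath = PATH_ID.matcher(url);
        if (byPath.find()) {
            return Optional.of(byPath.group(1));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extractProductId("https://www.oysho.cn/cn/some-product-c0p123456.html?cS=800"));
        System.out.println(extractProductId("https://www.oysho.cn/cn/page?origenId=987"));
        System.out.println(extractProductId("https://www.oysho.cn/cn/"));
    }
}
```

Returning an `Optional` lets the caller answer with a "no spider found" message instead of failing with an index error when a URL has an unexpected shape.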
package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.SpiderUtil;
import net.sf.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
/**
* Prada data spider
*
* @author 爱酱油不爱醋
*/
@Component("pradaSpider")
public class PradaSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(PradaSpider.class);
/**
* Prada data spider
* @param targetUrl URL of the product detail page
* @return the formatted and translated JSON data
*/
@Override
public JSONObject captureItem(String targetUrl) throws URISyntaxException, IOException, ExecutionException, InterruptedException, TimeoutException {
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.PRADA.getValue());
ProductResponse productResponse = formatProductResponse(content);
JSONObject resultObj = JSONObject.fromObject(productResponse);
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Format the response data
* @param content the main page data
* @return the formatted data
*/
private ProductResponse formatProductResponse(String content) {
// Create the wrapper object
ProductResponse productResponse = new ProductResponse();
// The product has properties, so set the flag to true
productResponse.setPropFlag(true);
// Stock info: with no real stock data, use placeholders (9999 overall, 999 per SKU)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original and promotional prices of the product
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the common ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic product info
ItemInfo itemInfo = new ItemInfo();
// Parse the page into a Document
Document document = Jsoup.parse(content);
Elements pdpNameEle = document.select("div[class=col-xs-12 col-sm-12 pdp-name]");
// Read the price
String fullPrice = pdpNameEle.select("p[class=pdp-price]").text();
fullPrice = SpiderUtil.retainNumber(fullPrice);
fullPrice = SpiderUtil.exchangeRate(fullPrice);
//////////////////////////////////// Basic product info //////////////////////////////////////////////////
itemInfo.setShopName("Prada");
itemInfo.setShopUrl("https://www.prada.com/");
itemInfo.setItemId(pdpNameEle.select("div[class=pdp-sku]").text());
itemInfo.setTitle(pdpNameEle.select("h1").text());
//////////////////////////////////// Basic product info END ///////////////////////////////////////////////
//////////////////////////////////// Color properties //////////////////////////////////////////////////
Elements colorEle = document.select("div[class=stiky-style-images]").select("a");
Elements sizeEle = document.select("div[class=product-size]").select("ul").select("li");
itemInfo.setPic(colorEle.select("img[class=img-style selected]").attr("src"));
for (Element colorElement : colorEle) {
String colorNo = colorElement.attr("data-part-number");
String imgUrl = colorElement.select("img").attr("src");
ProductProp productPropColor = new ProductProp();
productPropColor.setPropName(colorNo);
productPropColor.setPropId(colorNo);
productPropColor.setImage(imgUrl);
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
//////////////////////////////////// Color properties END ////////////////////////////////////////////
for (Element sizeElement : sizeEle) {
String sizeNo = sizeElement.select("input").attr("data-unique-id");
ProductProp productPropSize = new ProductProp();
productPropSize.setPropId(sizeNo);
productPropSize.setPropName(sizeElement.select("p[class=number]").text());
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
//////////////////////////////////// Stock and original price ////////////////////////////////////////////
// Build the SKU key for this color/size pair
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
// Record that the product carries stock information
ProductSkuStock productSkuStock = new ProductSkuStock();
OriginalPrice originalPrice = new OriginalPrice();
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
productSkuStock.setSellableQuantity(999);
productSkuStock.setSkuStr(skuStr);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
// fullPrice was already passed through SpiderUtil.exchangeRate above; the source price is in RMB
String originalFullPrice = fullPrice;
originalPrice.setPrice(originalFullPrice);
productResponse.setPrice(originalFullPrice);
productResponse.setSalePrice(originalFullPrice + "-" + originalFullPrice);
originalPrice.setSkuStr(skuStr);
originalPriceList.add(originalPrice);
//////////////////////////////////// Stock and original price END ///////////////////////////////
}
}
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("Prada");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
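Each spider in this commit registers its color and size sets with an `if (get(...) == null) put ... else addAll ... put` block inside the loop. Because the same `Set` instance is stored on the first pass and reused afterwards, the `else` branch appears to add the set to itself, which changes nothing. A minimal sketch of an equivalent using `Map.computeIfAbsent`, with plain `String` values standing in for `ProductProp` (an assumption made to keep the example self-contained); `PropRegistry` and `addProp` are hypothetical names:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the productPropSet bookkeeping; not project code. */
final class PropRegistry {
    private final Map<String, Set<String>> props = new HashMap<>();

    /** Adds a value to the named group, creating the group's set on first use. */
    void addProp(String group, String value) {
        props.computeIfAbsent(group, key -> new LinkedHashSet<>()).add(value);
    }

    /** Returns the values of a group, or an empty set when the group is unknown. */
    Set<String> get(String group) {
        return props.getOrDefault(group, Collections.emptySet());
    }

    public static void main(String[] args) {
        PropRegistry registry = new PropRegistry();
        registry.addProp("color", "red");
        registry.addProp("color", "blue");
        registry.addProp("color", "red"); // duplicate, ignored by the set
        registry.addProp("size", "M");
        System.out.println(registry.get("color"));
        System.out.println(registry.get("size"));
    }
}
```

In the real spiders the set elements are `ProductProp` objects, so whether duplicates collapse depends on that class's `equals` and `hashCode`, which this diff does not show.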
......@@ -2,17 +2,20 @@ package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.ProductResponse;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.spider.SpiderUtil;
import com.diaoyun.zion.master.util.SpiderUtil;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.math.BigDecimal;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
......@@ -23,45 +26,163 @@ import java.util.concurrent.TimeoutException;
*/
@Component("pullandbearSpider")
public class PullandbearSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(PullandbearSpider.class);
/**
* PullAndBear product detail URL prefix
*/
private static final String PULL_AND_BEAR_URL="https://www.pullandbear.cn/itxrest/2/catalog/store/24009528/20309423/category/0/product/";
/**
* PullAndBear data spider
* @see SpiderUtil#formatPullAndBearProductResponse the data formatting method
* @param targetUrl URL of the product detail page
* @return the formatted and translated JSON data
*/
@Override
public JSONObject captureItem(String targetUrl) throws URISyntaxException, IOException, ExecutionException, InterruptedException, TimeoutException {
String pId = targetUrl.substring(targetUrl.lastIndexOf("p")+1, targetUrl.lastIndexOf(".html"));
targetUrl = PULL_AND_BEAR_URL + pId + "/detail?languageId=-7&appId=1";
targetUrl = "https://www.pullandbear.cn/itxrest/2/catalog/store/24009528/20309423/category/0/product/" + pId + "/detail?languageId=-7&appId=1";
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.PULLANDBEAR.getValue());
JSONObject resultJson = JSONObject.fromObject(content);
ProductResponse productResponse = SpiderUtil.formatPullAndBearProductResponse(resultJson, pId);
ProductResponse productResponse = formatProductResponse(resultJson, pId);
resultJson = JSONObject.fromObject(productResponse);
// Translate
TranslateHelper.translateProductResponse(resultJson);
return resultJson;
}
// /**
// * How to fetch PullAndBear product detail data
// * @param args
// * @throws Exception
// */
// public static void main(String[] args) throws Exception {
// String targetUrl = "https://www.pullandbear.cn/cn/%25E7%2594%25B7%25E8%25A3%2585/%25E6%259C%258D%25E8%25A3%2585/%25E5%25A4%25A7%25E8%25A1%25A3%25E5%2592%258C%25E5%25A4%25B9%25E5%2585%258B/cazadora-tipo-plumas-costuras-invisibles-c-capucha-c1030204837p501658014.html?cS=800";
// String pId = targetUrl.substring(targetUrl.lastIndexOf("p")+1, targetUrl.lastIndexOf(".html"));
// targetUrl = PULL_AND_BEAR_URL + pId + "/detail?languageId=-7&appId=1";
// String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.PULLANDBEAR.getValue());
// System.err.println(content);
// }
/**
* Format the PullAndBear response data
* @see com.diaoyun.zion.chinafrica.bis.impl.PullandbearSpider
* @param dataMap the main JSON data
* @param pId the product id taken from the URL
* @return the formatted data
*/
private ProductResponse formatProductResponse(JSONObject dataMap, String pId) {
// Create the wrapper object
ProductResponse productResponse = new ProductResponse();
// The product has properties, so set the flag to true
productResponse.setPropFlag(true);
// Stock info: with no real stock data, use placeholders (9999 overall, 999 per SKU)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original and promotional prices of the product
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the common ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSetColor = new HashSet<>(16);
Set<ProductProp> sizePropSetSize = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic product info
ItemInfo itemInfo = new ItemInfo();
// Take the first bundleProductSummaries node
JSONObject bundleProductSummariesObj = dataMap.getJSONArray("bundleProductSummaries").getJSONObject(0);
//////////////////////////////////// Basic product info ////////////////////////////////////////////
itemInfo.setShopName("PullAndBear");
itemInfo.setShopUrl("https://www.pullandbear.cn/");
itemInfo.setItemId(pId);
itemInfo.setTitle(bundleProductSummariesObj.getString("name"));
//////////////////////////////////// Basic product info END (the picture is set below) ////////////////////////////////////////////
// Take the colors array node
JSONArray colorsArr = bundleProductSummariesObj.getJSONObject("detail").getJSONArray("colors");
productResponse.setStockFlag(true);
for (int i = 0; i < colorsArr.size(); i++) {
JSONObject colorsObj = colorsArr.getJSONObject(i);
//////////////////////////////////// Color and image properties ////////////////////////////////////////////
JSONObject imageObj = colorsObj.getJSONObject("image");
String colorNo = colorsObj.getString("id");
String color = colorsObj.getString("name");
String imageUrl = "https://static.pullandbear.cn/2/photos/"
+ imageObj.getString("url")
+ "_2_1_8.jpg?t="
+ imageObj.getString("timestamp");
if (i == 0) {
itemInfo.setPic(imageUrl);
}
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(colorNo);
productPropColor.setPropName(color);
productPropColor.setImage(imageUrl);
propSetColor.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSetColor);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSetColor.addAll(oldPropSet);
productPropSet.put("颜色", propSetColor);
}
//////////////////////////////////// Color and image properties END ////////////////////////////////////////////
// Take the sizes array
JSONArray sizesArr = colorsObj.getJSONArray("sizes");
for (int j = 0; j < sizesArr.size(); j++) {
JSONObject sizesObj = sizesArr.getJSONObject(j);
///////////////////////// Size properties ////////////////////
String sizeNo = sizesObj.getString("sku");
String size = sizesObj.getString("name");
ProductProp productPropSize = new ProductProp();
productPropSize.setPropName(size);
productPropSize.setPropId(sizeNo);
sizePropSetSize.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSetSize);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSetSize.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSetSize);
}
///////////////////////// Size properties END ////////////////////
// SKU key for this color/size pair
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
//////////////////////////////////// Stock ////////////////////////////////////////////
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSkuStr(skuStr);
productSkuStock.setSellableQuantity(999);
productSkuStockList.add(productSkuStock);
//////////////////////////////////// Stock END /////////////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
// Read the original price of the product
String fullPrice = sizesObj.getString("price");
BigDecimal priceOld = new BigDecimal(fullPrice);
BigDecimal div = new BigDecimal("100");
BigDecimal priceNew = priceOld.divide(div, 2, BigDecimal.ROUND_DOWN);
// TODO: exchange-rate conversion; the source price is in RMB
fullPrice = SpiderUtil.exchangeRate(priceNew.toString());
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(fullPrice);
originalPriceList.add(originalPrice);
dynStock.setProductSkuStockList(productSkuStockList);
productResponse.setPrice(fullPrice);
productResponse.setSalePrice(fullPrice + "-" + fullPrice);
//////////////////////////////////// Original price END //////////////////////////////////
}
}
productResponse.setPropFlag(true);
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("PullAndBear");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
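The Oysho and PullAndBear spiders both divide the API's `price` string by 100 using `BigDecimal.ROUND_DOWN`. The integer rounding constants on `BigDecimal` have been deprecated since Java 9 in favor of `java.math.RoundingMode`. A minimal sketch of the same conversion, assuming the API reports prices in hundredths of the currency unit, as the division by 100 suggests; `PriceUtil.fromMinorUnits` is a hypothetical helper, not project code:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical helper, not part of the project: converts minor-unit prices to decimals. */
final class PriceUtil {
    private static final BigDecimal HUNDRED = new BigDecimal("100");

    private PriceUtil() {
    }

    /** Converts a price given in hundredths (for example "25900") to a two-decimal string. */
    static String fromMinorUnits(String rawPrice) {
        return new BigDecimal(rawPrice)
                .divide(HUNDRED, 2, RoundingMode.DOWN)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(fromMinorUnits("25900"));
        System.out.println(fromMinorUnits("1999"));
    }
}
```

Keeping the conversion in one place also avoids allocating the divisor on every iteration of the size loop.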
......
package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.SpiderUtil;
import net.sf.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
/**
* Revolve data spider
*
* @author 爱酱油不爱醋
*/
@Component("revolveSpider")
public class RevolveSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(RevolveSpider.class);
/**
* Revolve data spider
* @param targetUrl URL of the product detail page
* @return the formatted and translated JSON data
*/
@Override
public JSONObject captureItem(String targetUrl) throws URISyntaxException, IOException, ExecutionException, InterruptedException, TimeoutException {
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.APPLE.getValue());
ProductResponse productResponse = formatProductResponse(content);
JSONObject resultObj = JSONObject.fromObject(productResponse);
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Format the response data
* @param content the main page content
* @return the formatted data
*/
private ProductResponse formatProductResponse(String content) {
// Create the wrapper object
ProductResponse productResponse = new ProductResponse();
// The product has properties, so set the flag to true
productResponse.setPropFlag(true);
// Stock info: the overall quantity is a 9999 placeholder; per-SKU stock is read from the page below
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original and promotional prices of the product
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the common ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic product info
ItemInfo itemInfo = new ItemInfo();
// Parse the page into a Document
Document document = Jsoup.parse(content);
//////////////////////////////////// Basic product info //////////////////////////////////////////////////
itemInfo.setItemId(document.select("input[id=productCode]").attr("value"));
itemInfo.setShopName("Revolve");
itemInfo.setShopUrl("http://www.revolve.com");
itemInfo.setTitle(document.select("meta[property=og:title]").attr("content"));
//////////////////////////////////// Basic product info END ///////////////////////////////////////////////
Elements colorsEle = document.select("fieldset[aria-labelledby=color-sr-text]")
.select("ul[id=product-swatches]").select("li");
Elements sizesEle = document.select("div[class=product-sizes product-sections]")
.select("ul[id=size-ul]").select("li").select("input");
String fullPrice = document.select("meta[property=wanelo:product:price]").attr("content");
// Check the currency; convert only when the price is not in USD
if (!"USD".equals(document.select("meta[property=wanelo:product:price:currency]").attr("content"))) {
fullPrice = SpiderUtil.exchangeRate(fullPrice);
}
for (Element colorEle : colorsEle) {
String colorNo = colorEle.attr("data-swatch-code");
String color = colorEle.select("img").attr("alt");
String imgUrl = colorEle.select("img").attr("src");
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(colorNo);
productPropColor.setPropName(color);
productPropColor.setImage(imgUrl);
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
for (Element sizeEle : sizesEle) {
String sizeNo = sizeEle.attr("value");
String size = sizeEle.attr("data-size");
ProductProp productPropSize = new ProductProp();
productPropSize.setPropId(sizeNo);
productPropSize.setPropName(size);
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size properties END ////////////////////
// SKU key for this color/size pair
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
//////////////////////////////////// Stock ////////////////////////////////////////////
// Record that the product carries stock information
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
// Read the stock quantity
int sellableQuantity = Integer.valueOf(sizeEle.attr("data-qty"));
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSkuStr(skuStr);
productSkuStock.setSellableQuantity(sellableQuantity);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END ///////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
OriginalPrice originalPrice = new OriginalPrice();
// Currency handling happens above: the price is converted only when it is not in USD
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(fullPrice);
originalPriceList.add(originalPrice);
productResponse.setPrice(fullPrice);
productResponse.setSalePrice(fullPrice + "-" + fullPrice);
//////////////////////////////////// Original price END //////////////////////////////////
}
}
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("Revolve");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
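Parsing the `data-qty` attribute directly throws `NumberFormatException` when the attribute is missing, because Jsoup's `attr` returns an empty string for absent attributes. A minimal sketch of a tolerant parse, assuming a caller-chosen fallback is acceptable for unknown stock and that negative values should be clamped to zero; `StockParser.parseQuantity` is a hypothetical helper, not project code:

```java
/** Hypothetical helper, not part of the project: parses a stock quantity defensively. */
final class StockParser {
    private StockParser() {
    }

    /** Returns the parsed quantity (never negative), or the fallback when the input is unusable. */
    static int parseQuantity(String raw, int fallback) {
        if (raw == null) {
            return fallback;
        }
        String trimmed = raw.trim();
        if (trimmed.isEmpty()) {
            return fallback;
        }
        try {
            return Math.max(Integer.parseInt(trimmed), 0);
        } catch (NumberFormatException e) {
            // Non-numeric attribute value: fall back instead of aborting the whole crawl
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseQuantity("12", 0));
        System.out.println(parseQuantity("", 0));
        System.out.println(parseQuantity("n/a", 999));
    }
}
```

With such a helper, one malformed size entry no longer prevents the rest of the product from being returned.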
......@@ -21,7 +21,7 @@ public class StripePay {
*/
public static Charge createCharge(Integer amount,String sk,String token) throws StripeException {
Stripe.apiKey = sk;
Map<String, Object> chargeParams = new HashMap<String, Object>();
Map<String, Object> chargeParams = new HashMap<>(16);
chargeParams.put("amount", amount);
chargeParams.put("currency", "usd");
// Shown on the post-payment page
......
......@@ -29,7 +29,9 @@ import java.util.concurrent.TimeoutException;
public class TbItemSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(TbItemSpider.class);
// Taobao product detail
/**
* Taobao product detail page URL
*/
private static final String taobaoUrl="https://item.taobao.com/item.htm?";
@Override
......@@ -80,8 +82,6 @@ public class TbItemSpider implements IItemSpider {
return returnJson;
}
/**
* Translate the specification properties
* @param propMap map of specification properties
......@@ -127,8 +127,6 @@ public class TbItemSpider implements IItemSpider {
}
}
/**
* Remove parameters that require login or do not need to be returned
* @param usableSibUrl
......
......@@ -2,7 +2,6 @@ package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.master.thread.TaskLimitSemaphore;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.JsoupUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
......@@ -13,6 +12,7 @@ import net.sf.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.ArrayList;
......@@ -23,12 +23,16 @@ import java.util.concurrent.TimeoutException;
/**
* Tmall data spider
*
* @author G
*/
@Component("tmItemSpider")
public class TmItemSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(TmItemSpider.class);
// Tmall URL
/**
* Tmall product detail page URL
*/
private static final String tmallUrl="https://detail.m.tmall.com/item.htm?";
@Override
......@@ -80,41 +84,4 @@ public class TmItemSpider implements IItemSpider {
return returnJson;
}
/**
* Extract the relevant parameters and build a new URL
* @param targetUrl
* @return
*/
/*@Deprecated
private String processUrl(String targetUrl) throws URISyntaxException, MalformedURLException {
String newUrl=tmallUrl;
//if(targetUrl.contains("h5.m.taobao.com")) {
// Strip characters that would corrupt the parameters
targetUrl= targetUrl.replaceAll("\\{","");
targetUrl= targetUrl.replaceAll("\\}","");
Map<String,String> paramMap=HttpClientUtil.getParamMap(targetUrl);
// Taobao currently needs four parameters: spm, id, scm, pvid
// Parameter that causes errors: ali_refid
*//*for(Map.Entry<String,String> entry:paramMap.entrySet()) {
if("id".equals(entry.getKey())) {
newUrl=newUrl.replace("itemId",entry.getValue());
break;
}
}*//*
StringBuffer paramBuffer=new StringBuffer();
for(Map.Entry<String,String> entry:paramMap.entrySet()) {
if("ali_refid".equals(entry.getKey())||"track_params".equals(entry.getKey())||"utparam".equals(entry.getKey())
||"rmdChannelCode".equals(entry.getKey())||"locate".equals(entry.getKey())) {
} else {
paramBuffer.append(entry.getKey()+"="+entry.getValue()+"&");
}
}
return newUrl+paramBuffer.toString();
}*/
}
......@@ -2,18 +2,21 @@ package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.ProductResponse;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.spider.UnderArmourSpiderParse;
import com.diaoyun.zion.master.util.SpiderUtil;
import net.sf.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
......@@ -26,14 +29,8 @@ import java.util.concurrent.TimeoutException;
public class UnderArmourSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(UnderArmourSpider.class);
/**
* UnderArmour product detail page URL
*/
private static final String URBANREVIVO_URL = "https://www.underarmour.cn/p";
/**
* Return the crawled data
* @see UnderArmourSpiderParse#formatProductResponse data formatting
* @param targetUrl URL of the product detail page
* @return the formatted and translated JSON data
* @see
......@@ -41,11 +38,136 @@ public class UnderArmourSpider implements IItemSpider {
@Override
public JSONObject captureItem(String targetUrl) throws URISyntaxException, IOException, ExecutionException, InterruptedException, TimeoutException {
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.UNDERARMOUR.getValue());
ProductResponse productResponse = UnderArmourSpiderParse.formatProductResponse(content);
ProductResponse productResponse = formatProductResponse(content);
if (productResponse.getItemInfo() == null) {
JSONObject notFoundData = new JSONObject();
notFoundData.put("message", "找不到此类网址的数据爬虫!");
return notFoundData;
}
JSONObject resultObj = JSONObject.fromObject(productResponse);
// Translate the response
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the response data
* @param content raw HTML of the product page
* @return the formatted product data
*/
private ProductResponse formatProductResponse(String content) {
// Create the response wrapper
ProductResponse productResponse = new ProductResponse();
// Parse the HTML into a Document
Document document = Jsoup.parse(content);
// If no product id is found, this is not a product detail page, so return the empty response
String pId = document.select("div[id=SKU]").text();
if ("".equals(pId) || pId == null) {
return productResponse;
}
// The product has properties, so set the flag to true
productResponse.setPropFlag(true);
// Stock info: no usable stock data is available, so default the total to 9999 (each SKU below gets 999)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original prices and promotional prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the usual ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic product info
ItemInfo itemInfo = new ItemInfo();
//////////////////////////////////// Basic product info ////////////////////////////
itemInfo.setItemId(pId);
itemInfo.setShopName("UNDERARMOUR");
itemInfo.setShopUrl("https://www.underarmour.cn");
String pTitle = document.select("h3[class=commo-name]").text();
itemInfo.setTitle(pTitle);
String pPic = document.select("span[class=e-color-show]").text();
itemInfo.setPic("https://underarmour.scene7.com/is/image/Underarmour/V5-" + pPic + "_FC_Main");
//////////////////////////////////// Basic product info END /////////////////////////
//////////////////////////////////// Color properties ////////////////////////////
Elements colorEle = document.select("ul[class=color-choice float-clearfix e-color-choice]").select("li");
List<String> pColorList = colorEle.eachText();
List<String> pColorNoList = colorEle.eachAttr("itemcode");
for (int i = 0; i < pColorList.size(); i++) {
ProductProp productPropColor = new ProductProp();
productPropColor.setPropName(pColorList.get(i));
productPropColor.setPropId(pColorNoList.get(i));
productPropColor.setImage("https://underarmour.scene7.com/is/image/Underarmour/V5-" + pColorNoList.get(i) + "_FC_Main");
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
}
//////////////////////////////////// Color properties END ////////////////////////////////////////
///////////////////////// Size properties ///////////////////////////////////////////////////////
Elements sizeEle = document.select("ul[class=size-choice float-clearfix e-size-choice]").select("li");
List<String> pSizeList = sizeEle.eachText();
List<String> pSizeNoList = sizeEle.eachAttr("skuid");
for (int i = 0; i < pSizeList.size(); i++) {
ProductProp productPropSize = new ProductProp();
productPropSize.setPropId(pSizeNoList.get(i));
productPropSize.setPropName(pSizeList.get(i));
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
}
///////////////////////// Size properties END ///////////////////////////////////////////////////
//////////////////////////////////// Stock and original price ////////////////////////////////////////////
String fullPrice = document.select("p[class=commo-price]").text().replaceAll("¥", "");
// TODO convert the exchange rate; prices are currently in CNY
String originalFullPrice = SpiderUtil.exchangeRate(fullPrice);
for (String pColorNo : pColorNoList) {
for (String pSizeNo : pSizeNoList) {
// Build the SKU stock id
String skuStr = ";" + pColorNo + ";" + pSizeNo + ";";
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSellableQuantity(999);
productSkuStock.setSkuStr(skuStr);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setPrice(originalFullPrice);
originalPrice.setSkuStr(skuStr);
originalPriceList.add(originalPrice);
productResponse.setPrice(originalFullPrice);
productResponse.setSalePrice(originalFullPrice + "-" + originalFullPrice);
}
}
//////////////////////////////////// Stock and original price END ///////////////////////////////
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform(PlatformEnum.UNDERARMOUR.getValue());
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
\ No newline at end of file
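The color and size loops above repeat the same get, null check, `addAll`, `put` sequence for every property. As a minimal sketch, and only as a possible simplification rather than what this commit does, `Map.computeIfAbsent` expresses the same grouping in one call. `PropGroupingSketch`, `addProp`, the ASCII keys, and the use of plain `String` values instead of `ProductProp` are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the commit: groups property values by name.
final class PropGroupingSketch {

    // Adds value under key, creating the backing set the first time the key is seen.
    static void addProp(Map<String, Set<String>> props, String key, String value) {
        props.computeIfAbsent(key, k -> new LinkedHashSet<>()).add(value);
    }

    // Small demonstration with made-up values.
    static Map<String, Set<String>> demo() {
        Map<String, Set<String>> props = new LinkedHashMap<>();
        addProp(props, "color", "black");
        addProp(props, "color", "white");
        addProp(props, "color", "black"); // duplicate, ignored by the set
        addProp(props, "size", "M");
        return props;
    }
}
```

If the real `ProductProp` were used as the set element, deduplication would depend on its `equals` and `hashCode`, which this diff does not show.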
......@@ -2,10 +2,10 @@ package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.ProductResponse;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.spider.UrbanRevivoSpiderParse;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
......@@ -13,9 +13,12 @@ import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;
/**
* UrbanRevivo data spider
*
......@@ -25,14 +28,8 @@ import java.util.concurrent.TimeoutException;
public class UrbanRevivoSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(UrbanRevivoSpider.class);
/**
* UrbanRevivo product detail page URL
*/
private static final String UrbanRevivo_URL = "http://wap.ur.com.cn/product/detail";
/**
* UrbanRevivo data spider
* @see UrbanRevivoSpiderParse#formatProductResponse data formatting method
* @param targetUrl URL of the product detail page
* @return formatted and translated JSON data
*/
......@@ -47,11 +44,137 @@ public class UrbanRevivoSpider implements IItemSpider {
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.URBANREVIVO.getValue());
JSONObject resultObj = JSONObject.fromObject(content);
// Format the data
ProductResponse productResponse = UrbanRevivoSpiderParse.formatProductResponse(resultObj, pId);
ProductResponse productResponse = formatProductResponse(resultObj, pId);
resultObj = JSONObject.fromObject(productResponse);
// Translate the data
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the response data
* @param dataMap the main JSON payload
* @param pId the product id extracted from the URL
* @return the formatted product data
*/
private ProductResponse formatProductResponse(JSONObject dataMap, String pId) {
// Create the response wrapper
ProductResponse productResponse = new ProductResponse();
// The product has properties, so set the flag to true
productResponse.setPropFlag(true);
// Stock info: no usable stock data is available, so default the total to 9999 (each SKU below gets 999)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original prices and promotional prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the usual ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic product info
ItemInfo itemInfo = new ItemInfo();
// Read the "data" node
JSONObject dataObj = dataMap.getJSONObject("data");
//////////////////////////////////// Basic product info ////////////////////////////////////////////
itemInfo.setShopName("UrbanRevivo");
itemInfo.setShopUrl("http://www.ur.cn/index.html");
itemInfo.setItemId(pId);
itemInfo.setTitle(dataObj.getString("name"));
itemInfo.setPic("https://gw-img.ur.com.cn//" + dataObj.getString("image"));
//////////////////////////////////// Basic product info END ////////////////////////////////////////////
// Read the original price
String fullPrice = dataObj.getString("tagPrice");
// TODO convert the exchange rate; prices are currently in CNY
fullPrice = exchangeRate(fullPrice);
// Read the "colors" array
JSONArray colorsArr = dataObj.getJSONArray("colors");
for (int i = 0; i < colorsArr.size(); i++) {
JSONObject colorsObj = colorsArr.getJSONObject(i);
// Build the image URL
String imgUrl = "https://gw-img.ur.com.cn//" + colorsObj.getString("image");
//////////////////////////////////// Color properties ////////////////////////////////////////////
String colorNo = colorsObj.getString("productColorId");
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(colorNo);
productPropColor.setPropName(colorsObj.getString("aliasName"));
productPropColor.setImage(imgUrl);
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
//////////////////////////////////// Color properties END ////////////////////////////////////////////
// Read the "skus" array
JSONArray skusArr = colorsObj.getJSONArray("skus");
for (int j = 0; j < skusArr.size(); j++) {
JSONObject skusObj = skusArr.getJSONObject(j);
///////////////////////// Size properties ////////////////////
String sizeNo = skusObj.getString("barCode");
ProductProp productPropSize = new ProductProp();
productPropSize.setPropId(sizeNo);
productPropSize.setPropName(skusObj.getString("sizeAlias"));
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size properties END ////////////////////
// Build the SKU stock id
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
//////////////////////////////////// Stock ////////////////////////////////////////////
// The product carries stock info; create the list on first use
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSellableQuantity(999);
productSkuStock.setSkuStr(skuStr);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END /////////////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(fullPrice);
originalPriceList.add(originalPrice);
productResponse.setPrice(fullPrice);
productResponse.setSalePrice(fullPrice + "-" + fullPrice);
//////////////////////////////////// Original price END //////////////////////////////////
}
}
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("UrbanRevivo");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
\ No newline at end of file
......@@ -2,21 +2,22 @@ package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.ProductResponse;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.spider.SpiderUtil;
import com.diaoyun.zion.master.util.spider.VansSpiderParse;
import com.diaoyun.zion.master.util.SpiderUtil;
import net.sf.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
......@@ -31,22 +32,130 @@ public class VansSpider implements IItemSpider {
/**
* Vans data spider
* @see VansSpiderParse#formatProductResponse data formatting method
* @param targetUrl URL of the product detail page
* @return formatted and translated JSON data
*/
@Override
public JSONObject captureItem(String targetUrl) throws URISyntaxException, IOException, ExecutionException, InterruptedException, TimeoutException {
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.VANS.getValue());
Document document = Jsoup.parse(content);
String pTitle = document.select("product-titles").text();
String[] spilt = targetUrl.split("/");
String pId = SpiderUtil.retainNumber(spilt[4]);
targetUrl = "https://" + spilt[2] + "/wap/product-ajax_product_spec-" + pId + ".html";
content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.VANS.getValue());
ProductResponse productResponse = VansSpiderParse.formatProductResponse(content, pId, pTitle);
ProductResponse productResponse = formatProductResponse(content);
JSONObject resultObj = JSONObject.fromObject(productResponse);
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the response data
* @param content raw HTML of the product page
* @return the formatted product data
*/
public static ProductResponse formatProductResponse(String content) {
// Create the response wrapper
ProductResponse productResponse = new ProductResponse();
// The product has properties, so set the flag to true
productResponse.setPropFlag(true);
// Stock info: no usable stock data is available, so default the total to 9999 (each SKU below gets 999)
DynStock dynStock = new DynStock();
dynStock.setSellableQuantity(9999);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
// Original prices and promotional prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
List<ProductPromotion> promotionList = new ArrayList<>();
// Product properties; the usual ones are color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
Set<ProductProp> sizePropSet = new HashSet<>(16);
productResponse.setStockFlag(true);
// Basic product info
ItemInfo itemInfo = new ItemInfo();
Document document = Jsoup.parse(content);
//////////////////////////////////// Basic product info ////////////////////////////
// itemInfo.setItemId(document.select(""));
itemInfo.setShopName("Vans");
itemInfo.setShopUrl("https://vans.com");
itemInfo.setTitle(document.select("meta[property=og:title]").attr("content"));
//////////////////////////////////// Basic product info END /////////////////////////
String fullPrice = document.select("meta[property=product:price:amount]").attr("content");
fullPrice = SpiderUtil.exchangeRate(fullPrice);
Elements pContentEle = document.select("div[id=product-content]").select("ul[class=dropdown-content]");
Elements colorsEle = pContentEle.select("ul[class=swatches Color]").select("a");
//////////////////////////////////// Color properties ////////////////////////////
for (Element colorEle : colorsEle) {
String colorNo = colorEle.attr("data-variationparameter");
String color = colorEle.attr("title");
// TODO image path not handled yet
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(colorNo);
productPropColor.setPropName(color);
// productPropColor.setImage(imgUrl);
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
//////////////////////////////////// Color properties END ////////////////////////////////////////
///////////////////////// Size properties ///////////////////////////////////////////////////////
Elements sizesEle = pContentEle.select("ul[class=swatches size]").select("a");
for (Element sizeEle : sizesEle) {
String sizeNo = sizeEle.attr("data-variationparameter");
String size = sizeEle.attr("title");
ProductProp productPropSize = new ProductProp();
productPropSize.setPropId(sizeNo);
productPropSize.setPropName(size);
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size properties END ///////////////////////////////////////////////////
//////////////////////////////////// Stock and original price ////////////////////////////////////////////
String skuStr = ";" + colorNo + ";" + sizeNo + ";";
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
productSkuStock.setSkuStr(skuStr);
productSkuStock.setSellableQuantity(999);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
OriginalPrice originalPrice = new OriginalPrice();
originalPrice.setSkuStr(skuStr);
originalPrice.setPrice(fullPrice);
originalPriceList.add(originalPrice);
productResponse.setPrice(fullPrice);
productResponse.setSalePrice(fullPrice + "-" + fullPrice);
//////////////////////////////////// Stock and original price END ///////////////////////////////
}
}
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform("Vans");
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
......@@ -2,22 +2,26 @@ package com.diaoyun.zion.chinafrica.bis.impl;
import com.diaoyun.zion.chinafrica.bis.IItemSpider;
import com.diaoyun.zion.chinafrica.enums.PlatformEnum;
import com.diaoyun.zion.chinafrica.vo.ProductResponse;
import com.diaoyun.zion.chinafrica.vo.*;
import com.diaoyun.zion.master.util.HttpClientUtil;
import com.diaoyun.zion.master.util.TranslateHelper;
import com.diaoyun.zion.master.util.spider.ZaraSpiderParse;
import net.sf.json.JSONArray;
import net.sf.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.math.BigDecimal;
import java.net.URISyntaxException;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
import static com.diaoyun.zion.master.util.SpiderUtil.exchangeRate;
/**
* Data spider for Zara, the Spanish fashion brand
* Zara data spider
*
* @author 爱酱油不爱醋
*/
......@@ -25,28 +29,156 @@ import java.util.concurrent.TimeoutException;
public class ZaraSpider implements IItemSpider {
private static Logger logger = LoggerFactory.getLogger(ZaraSpider.class);
/**
* Zara product detail page URL
*/
private static final String ZARA_URL = "https://www.zara.cn/cn/zh/";
/**
* Zara data spider
* @see com.diaoyun.zion.chinafrica.service.impl.SpiderServiceImpl# rewrites the product detail page path
* @see ZaraSpiderParse#getJsonData returns the extracted main product data
* @see ZaraSpiderParse#formatProductResponse data formatting method
* @param targetUrl URL of the product detail page
* @return formatted and translated JSON data
*/
@Override
public JSONObject captureItem(String targetUrl) throws URISyntaxException, IOException, ExecutionException, InterruptedException, TimeoutException {
JSONObject resultObj;
String content = HttpClientUtil.getContentByUrl(targetUrl, PlatformEnum.ZARA.getValue());
resultObj = ZaraSpiderParse.getJsonData(content);
ProductResponse productResponse = ZaraSpiderParse.formatProductResponse(resultObj);
int labelHeadIndex = content.indexOf("dataLayer");
int labelTailIndex = content.lastIndexOf(";window.zara.viewPayload");
content = content.substring(labelHeadIndex, labelTailIndex).replace("dataLayer = ", "");
JSONObject resultObj = JSONObject.fromObject(content);
ProductResponse productResponse = formatZaraProductResponse(resultObj);
resultObj = JSONObject.fromObject(productResponse);
TranslateHelper.translateProductResponse(resultObj);
return resultObj;
}
/**
* Formats the Zara response data
* @param dataMap the main JSON payload
* @return the formatted product data
*/
private ProductResponse formatZaraProductResponse(JSONObject dataMap) {
// Create the response wrapper
ProductResponse productResponse = new ProductResponse();
// Properties: Zara products have color and size
Map<String, Set<ProductProp>> productPropSet = new HashMap<>(16);
// Original prices
List<OriginalPrice> originalPriceList = new ArrayList<>();
// Promotional prices
List<ProductPromotion> promotionList = new ArrayList<>();
// Stock
DynStock dynStock = new DynStock();
// The data does not include exact stock counts, so default to an ample quantity
dynStock.setSellableQuantity(9999);
//////////////////////////////////// Basic product info ////////////////////////////////////////////
ItemInfo itemInfo = new ItemInfo();
itemInfo.setShopName("Zara");
itemInfo.setShopUrl(dataMap.getString("backUrl"));
JSONObject productObj = dataMap.getJSONObject("product");
itemInfo.setItemId(productObj.getString("id"));
itemInfo.setTitle(productObj.getString("name"));
//////////////////////////////////// Basic product info END (image is set below) ////////////////////////////////////////////
// Read the "colors" array
JSONArray colorsArr = productObj.getJSONObject("detail").getJSONArray("colors");
Set<ProductProp> sizePropSet = new HashSet<>(16);
Set<ProductProp> propSet = new HashSet<>(16);
productResponse.setStockFlag(true);
List<ProductSkuStock> productSkuStockList = dynStock.getProductSkuStockList();
for (int i = 0; i < colorsArr.size(); i++) {
JSONObject colorsObj = colorsArr.getJSONObject(i);
// Take the first object of the "detailImages" array
JSONObject detailImagesObj_0 = colorsObj.getJSONArray("detailImages").getJSONObject(0);
// Build the image URL; reference path: http://static.zara.cn/photos///2019/I/0/1/p/0858/457/800/17/w/1920/0858457800_1_1_1.jpg?ts=1570720340221
String imageUrl = "http://static.zara.cn/photos//"
+ detailImagesObj_0.getString("path")
+ "w/1920/"
+ detailImagesObj_0.getString("name")
+ "_1.jpg?ts="
+ detailImagesObj_0.getString("timestamp");
if (i == 0) {
// Basic product info: use the first color's image as the main picture
itemInfo.setPic(imageUrl);
}
//////////////////////////////////// Color properties ////////////////////////////////////////////
ProductProp productPropColor = new ProductProp();
productPropColor.setPropId(colorsObj.getString("productId"));
productPropColor.setPropName(colorsObj.getString("name"));
productPropColor.setImage(imageUrl);
propSet.add(productPropColor);
if (productPropSet.get("颜色") == null) {
productPropSet.put("颜色", propSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("颜色");
propSet.addAll(oldPropSet);
productPropSet.put("颜色", propSet);
}
//////////////////////////////////// Color properties END ////////////////////////////////////////////
// Read the "sizes" array
JSONArray sizesArr = colorsArr.getJSONObject(i).getJSONArray("sizes");
for (int j = 0; j < sizesArr.size(); j++) {
JSONObject sizesObj = sizesArr.getJSONObject(j);
// SKU stock id (for Zara: color id + size id)
String skuStr = ";" + colorsObj.getString("productId") + ";" + sizesObj.getString("sku") + ";";
///////////////////////// Size properties ////////////////////
ProductProp productPropSize = new ProductProp();
String size = sizesObj.getString("name");
productPropSize.setPropId(sizesObj.getString("sku"));
productPropSize.setPropName(size);
sizePropSet.add(productPropSize);
if (productPropSet.get("尺码") == null) {
productPropSet.put("尺码", sizePropSet);
} else {
Set<ProductProp> oldPropSet = productPropSet.get("尺码");
sizePropSet.addAll(oldPropSet);
productPropSet.put("尺码", sizePropSet);
}
///////////////////////// Size properties END ////////////////////
//////////////////////////////////// Stock ////////////////////////////////////////////
// The product carries stock info; create the list on first use
if (productSkuStockList == null) {
productSkuStockList = new ArrayList<>();
}
ProductSkuStock productSkuStock = new ProductSkuStock();
// Sellable quantity; Zara exposes no usable stock data
productSkuStock.setSellableQuantity(999);
productSkuStock.setSkuStr(skuStr);
productSkuStockList.add(productSkuStock);
dynStock.setProductSkuStockList(productSkuStockList);
//////////////////////////////////// Stock END /////////////////////////////////////////
//////////////////////////////////// Original price //////////////////////////////////
OriginalPrice originalPrice = new OriginalPrice();
// Read the original price
String fullPrice = sizesObj.getString("price");
BigDecimal priceOld=new BigDecimal(fullPrice);
BigDecimal div = new BigDecimal("100");
BigDecimal priceNew = priceOld.divide(div, 2, BigDecimal.ROUND_DOWN);
// TODO convert the exchange rate; prices are currently in CNY
fullPrice = exchangeRate(priceNew.toString());
originalPrice.setPrice(fullPrice);
productResponse.setPrice(fullPrice);
productResponse.setSalePrice(fullPrice + "-" + fullPrice);
originalPrice.setSkuStr(skuStr);
originalPriceList.add(originalPrice);
//////////////////////////////////// Original price END //////////////////////////////////
}
}
// Populate the JSON fields in the following order
productResponse.setPropFlag(true);
productResponse.setProductPropSet(productPropSet);
productResponse.setPlatform(PlatformEnum.ZARA.getValue());
productResponse.setPromotionList(promotionList);
productResponse.setOriginalPriceList(originalPriceList);
productResponse.setItemInfo(itemInfo);
productResponse.setDynStock(dynStock);
return productResponse;
}
}
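`captureItem` above cuts the embedded JSON out of the page with `indexOf`, `lastIndexOf`, and `substring`, and the price loop divides the `price` field by 100. A hedged sketch of both steps, with guards for a missing marker, might look like the following. `ZaraExtractSketch`, `extractBetween`, and `centsToYuan` are hypothetical names that do not appear in the commit, and any page string passed in is the caller's own.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Optional;

// Hypothetical helper, not part of the commit.
final class ZaraExtractSketch {

    // Returns the text between the first occurrence of head and the last
    // occurrence of tail, or empty if either marker is missing or out of order.
    static Optional<String> extractBetween(String content, String head, String tail) {
        int start = content.indexOf(head);
        if (start < 0) {
            return Optional.empty();
        }
        int from = start + head.length();
        int end = content.lastIndexOf(tail);
        if (end < from) {
            return Optional.empty();
        }
        return Optional.of(content.substring(from, end).trim());
    }

    // Converts a price given in hundredths (as the diff assumes) to a
    // two-decimal string, truncating like BigDecimal.ROUND_DOWN.
    static String centsToYuan(String cents) {
        return new BigDecimal(cents)
                .divide(new BigDecimal("100"), 2, RoundingMode.DOWN)
                .toString();
    }
}
```

With a page that lacks either marker, `extractBetween` returns an empty `Optional`, whereas calling `substring` with a `-1` index throws `StringIndexOutOfBoundsException`.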
......@@ -10,6 +10,9 @@ import com.diaoyun.zion.master.enums.EnumItemable;
*/
public enum CouponCategoryEnum implements EnumItemable<CouponCategoryEnum> {
/**
* Coupon category enum values
*/
SHOP("购物返券", 10),
REGISTER("注册", 20),
INVITE("邀请", 30);
......@@ -17,16 +20,17 @@ public enum CouponCategoryEnum implements EnumItemable<CouponCategoryEnum> {
private String label;
private Integer value;
CouponCategoryEnum(String label, Integer value) {
this.label = label;
this.value = value;
}
@Override
public String getLabel() {
return this.label;
}
@Override
public Integer getValue() {
return this.value;
}
......
......@@ -10,6 +10,9 @@ import com.diaoyun.zion.master.enums.EnumItemable;
*/
public enum DeliveryStatusEnum implements EnumItemable<DeliveryStatusEnum> {
/**
* Delivery status enum values
*/
PROCESSING("等待处理", 0),
PURCHASE("已经代购", 10),
ON_LOAD("正在配送", 20),
......@@ -20,16 +23,17 @@ public enum DeliveryStatusEnum implements EnumItemable<DeliveryStatusEnum> {
private String label;
private Integer value;
DeliveryStatusEnum(String label, Integer value) {
this.label = label;
this.value = value;
}
@Override
public String getLabel() {
return this.label;
}
@Override
public Integer getValue() {
return this.value;
}
......
......@@ -10,6 +10,9 @@ import com.diaoyun.zion.master.enums.EnumItemable;
*/
public enum OrderStatusEnum implements EnumItemable<OrderStatusEnum> {
/**
* Order status enum values
*/
CANCEL("取消", 0),
PENDING_PAY("等待付款", 10),
PAID("已付款", 20),
......@@ -20,7 +23,6 @@ public enum OrderStatusEnum implements EnumItemable<OrderStatusEnum> {
private String label;
private Integer value;
OrderStatusEnum(String label, Integer value) {
this.label = label;
this.value = value;
......
......@@ -19,19 +19,31 @@ public enum PlatformEnum implements EnumItemable<PlatformEnum> {
GAP("GAP", "gap"),
ZARA("Zara", "zara"),
UNIQLO("优衣库", "uniqlo"),
NIKE("NIKE", "nike"),
NIKE("耐克", "nike"),
ADIDAS("阿迪达斯", "adidas"),
HM("H&M", "hm"),
LILY("Lily", "lily"),
EIFINI("伊芙丽", "eifini"),
URBANREVIVO("UrbanRevivo", "urbanrevivo"),
ABERCROMBIEFITCH("Abercrombie&Fitch", "aberCrombieFitch"),
UNDERARMOUR("安德玛", "underarmour"),
OCHIRLY("Ochirly", "ochirly"),
OCHIRLY("欧时力", "ochirly"),
ESPRIT("思捷", "esprit"),
LEVI("李维斯", "levi"),
MOCO("MO&Co.", "moco"),
MASSIMODUTTI("MassimoDutti", "massimodutti"),
COACH("蔻驰", "coach"),
REVOLVE("Revolve", "revolve"),
VANS("范斯", "vans"),
OYSHO("Oysho", "oysho"),
STRADIVARIUS("斯特拉迪瓦里斯", "stradivarius"),
MAJE("Maje", "maje"),
GUCCI("古驰", "gucci"),
BURBERRY("博柏利", "burberry"),
PRADA("普拉达", "prada"),
FENDI("芬迪", "fendi"),
APPLE("苹果", "apple"),
LOUISVUITTON("路易威登LV", "louisVuitton"),
UN("未知", "un"),
AfriEshop("afri-eshop","afri-eshop" );
......
......@@ -59,10 +59,18 @@ public class ItemSpiderFactory {
iItemSpider = (IItemSpider) SpringContextUtil.getBean("lilySpider");
break;
}
case "eifini": {
iItemSpider = (IItemSpider) SpringContextUtil.getBean("eifiniSpider");
break;
}
case "urbanrevivo": {
iItemSpider = (IItemSpider) SpringContextUtil.getBean("urbanrevivoSpider");
break;
}
case "aberCrombieFitch": {
iItemSpider = (IItemSpider) SpringContextUtil.getBean("aberCrombieFitchSpider");
break;
}
case "underarmour": {
iItemSpider = (IItemSpider) SpringContextUtil.getBean("underarmourSpider");
break;
......@@ -91,10 +99,50 @@ public class ItemSpiderFactory {
iItemSpider = (IItemSpider) SpringContextUtil.getBean("coachSpider");
break;
}
case "revolve": {
iItemSpider = (IItemSpider) SpringContextUtil.getBean("revolveSpider");
break;
}
case "vans": {
iItemSpider = (IItemSpider) SpringContextUtil.getBean("vansSpider");
break;
}
case "oysho": {
iItemSpider = (IItemSpider) SpringContextUtil.getBean("oyshoSpider");
break;
}
case "stradivarius": {
iItemSpider = (IItemSpider) SpringContextUtil.getBean("stradivariusSpider");
break;
}
case "maje": {
iItemSpider = (IItemSpider) SpringContextUtil.getBean("majeSpider");
break;
}
case "gucci": {
iItemSpider = (IItemSpider) SpringContextUtil.getBean("gucciSpider");
break;
}
case "burberry": {
iItemSpider = (IItemSpider) SpringContextUtil.getBean("burberrySpider");
break;
}
case "prada": {
iItemSpider = (IItemSpider) SpringContextUtil.getBean("pradaSpider");
break;
}
case "fendi": {
iItemSpider = (IItemSpider) SpringContextUtil.getBean("fendiSpider");
break;
}
case "apple": {
iItemSpider = (IItemSpider) SpringContextUtil.getBean("appleSpider");
break;
}
case "louisVuitton": {
iItemSpider = (IItemSpider) SpringContextUtil.getBean("louisVuittonSpider");
break;
}
case "afri-eshop":{
iItemSpider = (IItemSpider) SpringContextUtil.getBean("africaShopItemSpider");
break;
......
......@@ -7,33 +7,41 @@ import com.diaoyun.zion.chinafrica.factory.ItemSpiderFactory;
import com.diaoyun.zion.chinafrica.service.SpiderService;
import net.sf.json.JSONObject;
import org.apache.commons.lang3.StringUtils;
import org.apache.http.NameValuePair;
import org.apache.http.client.utils.URLEncodedUtils;
import org.springframework.stereotype.Service;
import java.io.IOException;
import java.math.BigDecimal;
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.Charset;
import java.util.*;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeoutException;
/**
* Spider service
*
* @author G
*/
@Service("spiderService")
public class SpiderServiceImpl implements SpiderService {
/**
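* Fetches item details: determines which platform the link belongs to, then delegates to that platform's spider
* @param targetUrl URL captured from the product detail page
* @return the item data captured by the matching spider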
*/
@Override
public JSONObject getItemDetail(String targetUrl) throws InterruptedException, IOException, ExecutionException, URISyntaxException, TimeoutException {
// Determine which platform the link belongs to
PlatformEnum platformEnum=judgeUrlType(targetUrl);
IItemSpider iItemSpider=ItemSpiderFactory.getSpider(platformEnum);
return iItemSpider.captureItem(targetUrl);
}
/**
* Gets the exchange rate
* @param currency currency pair code, e.g. FOREXUSDCNY for CNY to USD
* @return the exchange rate
*/
@Override
public BigDecimal getExchangeRate(String currency) throws IOException, URISyntaxException {
// Default to CNY to USD
if(StringUtils.isBlank(currency)) {
currency="FOREXUSDCNY";
......@@ -41,13 +49,18 @@ public class SpiderServiceImpl implements SpiderService {
return NetWorkSpider.getRateFromHexun(currency);
}
/**
* Determines which platform the link belongs to
* @param targetUrl URL captured from the product detail page
* @return the matching platform
*/
private PlatformEnum judgeUrlType(String targetUrl) {
PlatformEnum platformEnum = PlatformEnum.UN;
if (targetUrl.contains("taobao.com") && (targetUrl.contains("item.htm") || targetUrl.contains("detail.htm"))) {
platformEnum = PlatformEnum.TB;
} else if (targetUrl.contains("tmall.com/item.htm")) {
platformEnum = PlatformEnum.TM;
} else if (targetUrl.contains("pullandbear.cn/")) {
} else if (targetUrl.contains("pullandbear")) {
platformEnum = PlatformEnum.PULLANDBEAR;
} else if(targetUrl.contains("www.gap.cn/pdp/")) {
platformEnum=PlatformEnum.GAP;
......@@ -55,9 +68,9 @@ public class SpiderServiceImpl implements SpiderService {
platformEnum=PlatformEnum.NIKE;
} else if(targetUrl.contains("www.afri-eshop.com") && targetUrl.contains("/products/")) {
platformEnum=PlatformEnum.AfriEshop;
} else if (targetUrl.contains("zara.cn")) {
} else if (targetUrl.contains("zara")) {
platformEnum = PlatformEnum.ZARA;
} else if (targetUrl.contains("uniqlo.cn/") && targetUrl.contains("#/product?pid")) {
} else if (targetUrl.contains("uniqlo") && targetUrl.contains("#/product?pid")) {
platformEnum = PlatformEnum.UNIQLO;
} else if (targetUrl.contains("hm.com/m") && targetUrl.contains("productpage")) {
platformEnum = PlatformEnum.HM;
......@@ -65,15 +78,17 @@ public class SpiderServiceImpl implements SpiderService {
platformEnum=PlatformEnum.ADIDAS;
} else if(targetUrl.contains("http://www.lily.sh.cn/webapp/wcs/stores/servlet/lilystore")) {
platformEnum=PlatformEnum.LILY;
} else if(targetUrl.contains("eifini")) {
platformEnum=PlatformEnum.EIFINI;
} else if(targetUrl.contains("wap.ur") && targetUrl.contains("product")) {
platformEnum=PlatformEnum.URBANREVIVO;
} else if(targetUrl.contains("underarmour")) {
platformEnum=PlatformEnum.UNDERARMOUR;
} else if(targetUrl.contains("abercrombie")) {
platformEnum=PlatformEnum.ABERCROMBIEFITCH;
} else if(targetUrl.contains("ochirly.com") && targetUrl.contains("p/mobile/")) {
platformEnum=PlatformEnum.OCHIRLY;
} else if(targetUrl.contains("esprit.cn/product/") && targetUrl.contains("styleNo") && targetUrl.contains("skucode")) {
} else if(targetUrl.contains("esprit") && targetUrl.contains("product") && targetUrl.contains("styleNo") && targetUrl.contains("skucode")) {
platformEnum=PlatformEnum.ESPRIT;
} else if(targetUrl.contains("levi.com") && targetUrl.contains("product")) {
} else if(targetUrl.contains("levi.com") && targetUrl.contains("product") && targetUrl.contains("styleNo")) {
platformEnum=PlatformEnum.LEVI;
} else if(targetUrl.contains("moco.com/moco/")) {
platformEnum=PlatformEnum.MOCO;
......@@ -81,8 +96,28 @@ public class SpiderServiceImpl implements SpiderService {
platformEnum = PlatformEnum.MASSIMODUTTI;
} else if (targetUrl.contains("coach")) {
platformEnum = PlatformEnum.COACH;
} else if (targetUrl.contains("revolve")) {
platformEnum = PlatformEnum.REVOLVE;
} else if (targetUrl.contains("vans.com") && targetUrl.contains("wap/product")) {
platformEnum = PlatformEnum.VANS;
} else if (targetUrl.contains("oysho") && (targetUrl.contains("origenId") || targetUrl.contains("colorId")) ) {
platformEnum = PlatformEnum.OYSHO;
} else if (targetUrl.contains("stradivarius")) {
platformEnum = PlatformEnum.STRADIVARIUS;
} else if (targetUrl.contains("maje")) {
platformEnum = PlatformEnum.MAJE;
} else if (targetUrl.contains("gucci")) {
platformEnum = PlatformEnum.GUCCI;
} else if (targetUrl.contains("burberry.com")) {
platformEnum = PlatformEnum.BURBERRY;
} else if (targetUrl.contains("prada.com") && targetUrl.contains("products")) {
platformEnum = PlatformEnum.PRADA;
} else if (targetUrl.contains("fendi")) {
platformEnum = PlatformEnum.FENDI;
} else if (targetUrl.contains("apple") && targetUrl.contains("buy")) {
platformEnum = PlatformEnum.APPLE;
} else if (targetUrl.contains("louisvuitton")) {
platformEnum = PlatformEnum.LOUISVUITTON;
}
return platformEnum;
}
......
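The if/else chain in `judgeUrlType` depends on rule order, and bare substring checks such as `contains("zara")` or `contains("coach")` also match when the brand name only appears in a path or query string. A minimal sketch of one possible alternative is an ordered rule table that checks the URL host first. The class `PlatformRules`, its `Rule` type, and the plain string platform names are hypothetical and not part of this commit; the project code would return `PlatformEnum` values, and only a few rules are shown.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch, not part of the commit. Platform names are plain
// strings here; the project code would use PlatformEnum instead.
final class PlatformRules {

    // One rule: a fragment the host must contain, plus fragments the full URL must contain.
    private static final class Rule {
        final String platform;
        final String hostPart;
        final String[] urlParts;

        Rule(String platform, String hostPart, String... urlParts) {
            this.platform = platform;
            this.hostPart = hostPart;
            this.urlParts = urlParts;
        }
    }

    // Rules are tried in insertion order, so more specific rules go first.
    private static final List<Rule> RULES = new ArrayList<>();

    static {
        RULES.add(new Rule("TM", "tmall.com", "item.htm"));
        RULES.add(new Rule("ZARA", "zara"));
        RULES.add(new Rule("UNIQLO", "uniqlo", "#/product?pid"));
        RULES.add(new Rule("APPLE", "apple", "buy"));
    }

    // Returns the first matching platform name, or "UN" when nothing matches.
    static String match(String targetUrl) {
        String host;
        try {
            host = URI.create(targetUrl).getHost();
        } catch (IllegalArgumentException e) {
            return "UN"; // not a syntactically valid URI
        }
        if (host == null) {
            return "UN";
        }
        host = host.toLowerCase(Locale.ROOT);
        for (Rule rule : RULES) {
            if (!host.contains(rule.hostPart)) {
                continue;
            }
            boolean allPresent = true;
            for (String part : rule.urlParts) {
                if (!targetUrl.contains(part)) {
                    allPresent = false;
                    break;
                }
            }
            if (allPresent) {
                return rule.platform;
            }
        }
        return "UN";
    }

    public static void main(String[] args) {
        System.out.println(match("https://www.zara.cn/cn/zh/item-p123.html"));
        System.out.println(match("https://example.com/?ref=zara"));
    }
}
```

With this layout, adding a platform means adding one table row, and a URL such as `https://example.com/?ref=zara` is no longer routed to the Zara spider.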
......@@ -31,6 +31,7 @@ import java.util.regex.Pattern;
public class HttpClientUtil {
private static Logger logger = LoggerFactory.getLogger(HttpClientUtil.class);
/**
* Get the content behind a URL.
* @param sourceUrl target URL
......@@ -85,6 +86,7 @@ public class HttpClientUtil {
httpClient.close();
return content;
}
/**
* Get the web page content behind a URL.
* @param url target URL
......@@ -168,18 +170,31 @@ public class HttpClientUtil {
return HttpClients.custom().setDefaultHeaders(headerList).setDefaultCookieStore(cookieStore).build();
}
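// Send the request
/**
* Send a GET request.
* @param url target URL, without a query string
* @param paramMap query parameters appended to the URL
* @param charset charset used to encode the query string
* @return the response body as a string
*/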
public static String createConnection(String url,Map<String, Object> paramMap,String charset) throws IOException {
List<NameValuePair> formparams = setHttpParams(paramMap);
String param = URLEncodedUtils.format(formparams, charset);
HttpGet httpGet = new HttpGet(); // Build a GET request
// Build a GET request
HttpGet httpGet = new HttpGet();
httpGet.setURI(URI.create(url + "?" + param));
CloseableHttpClient httpClient=createBrowserClient();
HttpResponse sibResponse=httpClient.execute(httpGet);
HttpEntity sibResult = sibResponse.getEntity(); // Get the entity of the returned HttpResponse
// Get the entity of the returned HttpResponse
HttpEntity sibResult = sibResponse.getEntity();
return EntityUtils.toString(sibResult);
}
/**
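* Convert a parameter map into a list of HTTP name/value pairs.
* @param paramsMap parameters to convert
* @return the parameters as NameValuePair entries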
*/
private static List<NameValuePair> setHttpParams(Map<String, Object> paramsMap) {
List<NameValuePair> list = new ArrayList<>();
for (Map.Entry<String, Object> stringObjectEntry : paramsMap.entrySet()) {
......@@ -216,7 +231,8 @@ public class HttpClientUtil {
* @throws IOException
*/
public static String sendPostWithBodyParameter(String url,Map<String,Object> paramMap) throws IOException {
HttpPost httpPost = new HttpPost(); // Build a POST request
// Build a POST request
HttpPost httpPost = new HttpPost();
httpPost.setURI(URI.create(url));
httpPost.setHeader("Content-Type", "application/json");
// Set the parameters
......@@ -241,10 +257,10 @@ public class HttpClientUtil {
* @param chartSet
* @return
*/
public static String urlEncode(String url,String chartSet)
{
public static String urlEncode(String url,String chartSet) {
try {
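Matcher matcher = Pattern.compile("[^\\x00-\\xff]").matcher(url); // Double-byte characters, including Chinese characters and Chinese punctuation: [^\x00-\xff]; Chinese characters only: [\u4e00-\u9fa5]
// Double-byte characters, including Chinese characters and Chinese punctuation: [^\x00-\xff]; Chinese characters only: [\u4e00-\u9fa5]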
Matcher matcher = Pattern.compile("[^\\x00-\\xff]").matcher(url);
while (matcher.find()) {
String tmp=matcher.group();
url=url.replaceAll(tmp,java.net.URLEncoder.encode(tmp,chartSet));
......
......@@ -18,10 +18,15 @@ import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
/**
* Utility class for extracting data from HTML pages with Jsoup.
*
* @see Jsoup the HTML parsing library
* @version 1.0
*/
public class JsoupUtil {
public static String unknow = "未知";
private static Logger logger = LoggerFactory.getLogger(JsoupUtil.class);
/**
......@@ -91,12 +96,10 @@ public class JsoupUtil {
for (DataNode dataNode : element.dataNodes()) {
String dataStr = dataNode.getWholeData();
// Find the script tag that contains the g_config variable
Pattern p = Pattern.compile("(" + variableName + "){1,1}\\s*={1,1}[\\s\\S]*(;){1,1}"); // Regex for the value of the key
Matcher m = p.matcher(dataStr); // you have to use html here and NOT text! Text will drop the 'key' part
Pattern p = Pattern.compile("(" + variableName + "){1,1}\\s*={1,1}[\\s\\S]*(;){1,1}");
Matcher m = p.matcher(dataStr);
while ((m.find())) {
//System.out.println(m.group());
configGroup = m.group();
}
}
}
......@@ -119,8 +122,8 @@ public class JsoupUtil {
for (DataNode dataNode : element.dataNodes()) {
String dataStr = dataNode.getWholeData();
// Find the script tag that contains the g_config variable
Pattern p = Pattern.compile("(TShop.Setup){1,1}[\\s\\S]*?(\\);){1,1}"); // Regex for the value of the key
Matcher m = p.matcher(dataStr); // you have to use html here and NOT text! Text will drop the 'key' part
Pattern p = Pattern.compile("(TShop.Setup){1,1}[\\s\\S]*?(\\);){1,1}");
Matcher m = p.matcher(dataStr);
while ((m.find())) {
//System.out.println(m.group());
configGroup = m.group();
......@@ -134,6 +137,7 @@ public class JsoupUtil {
}
/**
* Extract variable values from the returned string.
*
* @param needInfo
* @param configStr
......@@ -141,15 +145,15 @@ public class JsoupUtil {
*/
private static void getInfoFromJsStr(List<String> needInfo, String configStr, Map<String, String> returnMap) {
for (String info : needInfo) {
//Get the requested info
// Get the requested info
String patternStr = "((" + info + "){1,1}?)\\s+(:{1,1}?)[\\s]*(('){1,1}?)[\\S]*(('){1,1}?)";
Pattern infoPattern = Pattern.compile(patternStr); // Regex for the value of the key
Pattern infoPattern = Pattern.compile(patternStr);
Matcher infoMatcher = infoPattern.matcher(configStr);
while (infoMatcher.find()) {
int infoBegin = infoMatcher.group().indexOf("'");
int infoEnd = infoMatcher.group().lastIndexOf("'");
String infoValue = infoMatcher.group().substring(infoBegin + 1, infoEnd);
//Store infoValue
// Store infoValue
returnMap.put(info, infoValue);
}
}
......@@ -222,16 +226,17 @@ public class JsoupUtil {
}
/**
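* Get the content of a script tag by its id
* @param content
* @param id
* @return
* Get the content of a script tag by its id
*
* @param content web page content
* @param id id of the <script> tag
* @return the content of the <script> tag, parsed as a JSONObject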
*/
public static JSONObject getScriptContentById(String content, String id) {
Document document = Jsoup.parse(content);
Element element = document.getElementById(id);
String dataStr=element.data();
JSONObject dataMap= JSONObject.fromObject(dataStr);
String dataStr = element.data();
JSONObject dataMap = JSONObject.fromObject(dataStr);
return dataMap;
}
......@@ -253,16 +258,22 @@ public class JsoupUtil {
/**
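* Get the value of a variable inside a script tag of the given web page content</br>
* Only works for the format:</br>
* window.produictId = 290000;
* -- Note:
* 1. Only works for the format:</br>
* window.produictId = 290000;</br>
* 2. The variable name must be unique within the web page content
* @param content web page content
* @param variableName variable name
* @return the value of the variable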
*/
public static String getScriptTagVariableContent(String content, String variableName) {
String detailStr = getScriptContent(content, variableName);
String[] spilt = detailStr.split("=");
return spilt[1].replaceAll(";", "").trim();
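int firstBrackets = detailStr.indexOf(variableName);
// Look for the terminating ';' only after the variable name, so an earlier ';' cannot produce an invalid range
int lastbrackets = detailStr.indexOf(";", firstBrackets);
detailStr = detailStr.substring(firstBrackets, lastbrackets)
.replace(variableName, "")
.replace("=", "").replace("'", "").trim();
return detailStr;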
}
/**
......@@ -279,5 +290,6 @@ public class JsoupUtil {
}
return map;
}
}
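The extraction done by `getScriptTagVariableContent` can be illustrated without Jsoup. The following is a minimal sketch, assuming the script text has already been pulled out of the page and the assignment has the `name = value;` form described in the Javadoc. The class `ScriptVariableSketch` and its `extract` method are hypothetical names, not part of this commit.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not part of the commit.
final class ScriptVariableSketch {

    // Returns the value assigned to variableName in scriptText, or null if absent.
    // Assumes the value itself contains no ';' character.
    static String extract(String scriptText, String variableName) {
        // Pattern.quote keeps characters such as '.' in the name from acting as regex syntax.
        Pattern pattern = Pattern.compile(Pattern.quote(variableName) + "\\s*=\\s*([^;]*);");
        Matcher matcher = pattern.matcher(scriptText);
        if (!matcher.find()) {
            return null;
        }
        String value = matcher.group(1).trim();
        // Strip one pair of matching surrounding quotes, if present.
        if (value.length() >= 2) {
            char first = value.charAt(0);
            char last = value.charAt(value.length() - 1);
            if ((first == '\'' || first == '"') && first == last) {
                value = value.substring(1, value.length() - 1);
            }
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(extract("var a = 1; window.produictId = 290000;", "window.produictId"));
    }
}
```

Returning `null` for a missing variable lets the caller decide how to handle pages whose layout has changed, instead of failing with an index error.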
package com.diaoyun.zion.master.util;
import com.diaoyun.zion.chinafrica.service.TbCfFeeService;
import java.math.BigDecimal;
import java.util.regex.Pattern;
/**
* Utility class for processing data captured by the spiders
*
* @author 爱酱油不爱醋
*/
public class SpiderUtil {
private static BigDecimal rate;
static {
TbCfFeeService tbCfFeeService = (TbCfFeeService) SpringContextUtil.getBean("tbCfFeeService");
rate = tbCfFeeService.getRateFee().getFeeRate();
}
/**
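* Convert a price from CNY to USD
* TODO keep the exchange rate in sync
* Note: for now the rate is the manually configured value read from the back end; it is not fetched dynamically yet
* @param fullPrice original price
* @return the price after applying the exchange rate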
*/
public static String exchangeRate(String fullPrice) {
return new BigDecimal(fullPrice).divide(rate, 2, BigDecimal.ROUND_UP).toString();
}
/**
* Remove every character that is not a digit
*
* @param str input string
* @return a string containing only digits
*/
public static String retainNumber(String str) {
str = Pattern.compile("[^0-9]").matcher(str).replaceAll("").trim();
return str;
}
}
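Two details of `SpiderUtil` are easy to miss: `retainNumber` also removes the decimal point, and the `BigDecimal.ROUND_UP` int constant has been deprecated since Java 9 in favour of `RoundingMode.UP`. A minimal sketch of the same conversion follows. The class `PriceSketch`, the helper `retainDecimal`, and the rate passed in as a parameter are assumptions for illustration; the commit reads the rate from `TbCfFeeService`.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical sketch, not part of the commit.
final class PriceSketch {

    // Same behaviour as SpiderUtil.retainNumber: keeps digits only, so "12.50" becomes "1250".
    static String retainNumber(String str) {
        return str.replaceAll("[^0-9]", "");
    }

    // Hypothetical variant that also keeps the decimal point, suitable for price strings.
    static String retainDecimal(String str) {
        return str.replaceAll("[^0-9.]", "");
    }

    // Divides a CNY price by the rate and rounds up to two decimal places.
    static String toUsd(String cnyPrice, BigDecimal rate) {
        return new BigDecimal(cnyPrice).divide(rate, 2, RoundingMode.UP).toString();
    }

    public static void main(String[] args) {
        String cleaned = retainDecimal("¥1,299.50");
        System.out.println(cleaned);
        System.out.println(toUsd(cleaned, new BigDecimal("7")));
    }
}
```

Passing the rate as an argument also avoids the static initializer in `SpiderUtil`, which requires the Spring context to be ready when the class is first loaded.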
package com.diaoyun.zion.master.util;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;
import org.json.JSONException;
import org.json.JSONObject;
import org.json.XML;
import java.io.IOException;
import java.io.StringWriter;
/**
* Utility class for handling XML data
*
* @author 爱酱油不爱醋
*/
public class XmlUtils {
/**
* Convert XML data to JSON
* @param xml XML string
* @return JSON string
*/
public static String convertXmlIntoJSONObject(String xml) {
JSONObject jsonObject = new JSONObject();
try {
Document xmlDocument = DocumentHelper.parseText(xml);
OutputFormat format = new OutputFormat();
format.setEncoding("UTF-8");
format.setExpandEmptyElements(true);
StringWriter out = new StringWriter();
XMLWriter writer = new XMLWriter(out, format);
try {
writer.write(xmlDocument);
writer.flush();
} catch (IOException e) {
e.printStackTrace();
}
// out.toString() yields the XML with empty elements expanded to the <a></a> form
jsonObject = XML.toJSONObject(out.toString());
} catch (DocumentException e1) {
e1.printStackTrace();
} catch (JSONException e) {
e.printStackTrace();
}
return jsonObject.toString();
}
}